Hi,
The npm and Go integrations don't support a lot of common OE features, such as:
* Download proxy
* Minimized image size (package split, single copy, dead code removal, …)
* Software version management
* Dependency management
* License compliance
* Vulnerability scanner
* SBOM generator
Even the `Download proxy` is only partly supported: npm packages can download artifacts during compile, and Go projects without a vendor directory download dependencies during compile.
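(To illustrate the Go side: without a vendor tree, `go build` resolves modules over the network at compile time. One way a recipe can avoid that is to pin the build to vendored sources, for example with something like the following - shown for illustration only:)

    # Illustration: force 'go build' to use the vendored sources so that
    # do_compile cannot reach the network; without a vendor/ directory the
    # build instead downloads modules at compile time.
    export GOFLAGS = "-mod=vendor"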
The current npm and Go support in OE is incomplete, and a user needs to set up a DevOps chain outside of OE to cover the missing parts. Furthermore, that DevOps chain needs its own download proxy, and npm and Go support cross-compilation by themselves, so the advantage of the OE integration is minimal.
Based on my work on npm improvements over the last months, I see two possible solutions:
a) Handle npm and Go projects like C/C++ or Python projects and create a recipe per project.
b) Remove npm and Go support from OE and build artifacts via an external DevOps chain.
I think the best solution would be a), because it avoids user-specific solutions and allows collaboration. A solution between a) and b) isn't reasonable, because it doesn't solve the problem of an additional DevOps chain and introduces a two-class society of languages.
Does anybody use npm or Go and care about the missing features?
Any feedback, opinions or interest would be helpful.
Regards
Stefan
Three possible solutions, please:
c) improve the npm and Go tooling in collaboration with the respective upstreams so that it fulfils our use cases.
Neither a) nor b) is tenable in my opinion.
Alex
Konrad Weihmann <kweihmann@...>
I guess the third possibility is the most likely one - unfortunately there doesn't seem to be much interest upstream in the way OE builds things, so it might be a fight against windmills.
Option a) could be doable in the long run (but would at least require upstream to acknowledge the OE way of doing things).
The more I think about it, the more option b) becomes likely, as there isn't actually a real-world consumer of any of these in core - so all the needed quality control remains somewhere else.
For the tooling part, I think we need to enable devtool to create recipes and source definitions separately, even for a whole dependency tree - but I admit I have no idea if that is doable, as the devtool sources have grown over time into something very hard to read and refactor (and if you asked me, the above idea sounds like a full rewrite of devtool, more or less from scratch). (I remember there was an exchange about this idea in late December on some of the lists.)
My idea would be to:
- drop npm/go support and move it to another place, one where there's actually a consumer of them
- get in contact with upstream nodejs/go to discuss ways to build things with the project's definition of reproducibility
- start a devtool refactor enabling the creation of multiple source/recipe definitions out of a dependency tree
- refactor go and npm support before moving them back into core
- add real-life consumers to core to avoid future regressions
No idea what that would mean in terms of work, or whether it is even doable.
If we do seriously embark on making npm/go better, the first step could be to make npm/go write out a reproducible manifest for the licenses and sub-packages that can be verified against two recipe checksums during fetch, and that ensures no further network access is necessary. That alone would make it a viable fetcher. Those manifests could contain all the information needed for further processing (e.g. versions, what depends on what, etc.). And yes, it's a bundled, self-contained approach, but that matches how the rest of the world is using npm.
Alex
On 14.01.2022 12:16, Alexander Kanavin wrote:
> Three possible solutions, please:
> c) improve the npm and Go tooling in collaboration with the respective upstreams so that it fulfils our use cases.

What is your use case? I am talking about more than the fetch, and most of the OE features aren't supported by the official tooling. Furthermore, the philosophies are totally different (e.g. central vs. distributed management, security vs. stability). I have worked a lot with npm over the last months, and the npm command isn't really the problem.
On 14.01.2022 13:18, Alexander Kanavin wrote:
> If we do seriously embark on making npm/go better, the first step could be to make npm/go write out a reproducible manifest for the licenses and sub-packages that can be verified against two recipe checksums during fetch [...]

In the fetch case we could simply use the existing manifest and lock files and add extra do_fetch_dependencies and do_patch_dependencies tasks after do_patch. The files are really simple to parse and to translate into multiple fetches. But this will not fix dependency management or license compliance: to my knowledge, neither tool cares about license files or allows manipulating the dependencies. What is the advantage of a single recipe over multiple recipes if both could be auto-generated?
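A rough sketch of the idea - illustrative only, assuming an npm package-lock.json and skipping checksum verification and error handling:

    # Sketch, not existing OE code: fetch the pinned npm dependencies
    # through the normal bitbake fetcher after do_patch, so that mirrors
    # and the download proxy apply to them as well.
    python do_fetch_dependencies() {
        import json
        import os

        lockfile = os.path.join(d.getVar('S'), 'package-lock.json')
        with open(lockfile) as f:
            lock = json.load(f)

        uris = []
        for name, entry in lock.get('dependencies', {}).items():
            # 'resolved' holds the registry tarball URL of the pinned version
            if 'resolved' in entry:
                uris.append(entry['resolved'])

        if uris:
            fetcher = bb.fetch2.Fetch(uris, d)
            fetcher.download()
    }
    addtask fetch_dependencies after do_patch before do_configure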
On 2022-01-14 06:16, Alexander Kanavin wrote:
> Three possible solutions, please:
> c) improve the npm and Go tooling in collaboration with the respective upstreams so that it fulfils our use cases.
> Neither a) nor b) is tenable in my opinion.

100% agree.

MarkA
On 2022-01-14 07:18, Alexander Kanavin wrote:
> If we do seriously embark on making npm/go better, the first step could be to make npm/go write out a reproducible manifest for the licenses and sub-packages that can be verified against two recipe checksums during fetch [...]
I can't speak to npm, but for Go this is where I wanted to see things go. Just as work was done to avoid unexpected downloads of Python eggs, I always felt the key to improving the Go integration was some form of automated SRC_URI generation. Once available, it could be leveraged for licensing and such.
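To make that concrete, auto-generated entries could look roughly like this (module paths, revisions and the destination layout are illustrative):

    # Illustrative shape of generated entries for a Go dependency tree;
    # every module pinned in go.mod/go.sum becomes one pinned fetch.
    SRC_URI = "git://github.com/example/app;protocol=https;branch=main;name=app \
               git://github.com/example/logging;protocol=https;branch=main;name=logging;destsuffix=git/vendor/github.com/example/logging \
              "
    SRCREV_app = "<commit id of the application release>"
    SRCREV_logging = "<commit id resolved from go.sum>"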
Stefan, by the way, the reason a) is not possible is that multiple Go applications can use a shared 'library' but at different versions (or even different git commit IDs). The number of recipes, and versions of recipes, required to support Go applications quickly becomes difficult to manage. Sorry, I should have included this in my first email.

MarkA
On 14.01.2022 15:15, Mark Asselstine via lists.openembedded.org wrote:
> Stefan, by the way, the reason a) is not possible is that multiple Go applications can use a shared 'library' but at different versions (or even different git commit IDs).

Why is this simpler? The recipes need to list all the information about their dependencies. That means you repeat a lot of code and need to change a lot of files if you update a single dependency.

> The number of recipes, and versions of recipes, required to support Go applications quickly becomes difficult to manage.

How can one big recipe, instead of multiple small recipes, solve this problem?
On 2022-01-14 10:05, Stefan Herbrechtsmeier wrote:
> Why is this simpler? The recipes need to list all the information about their dependencies. That means you repeat a lot of code and need to change a lot of files if you update a single dependency.

We went through this with go recipes in meta-virt. It didn't work. You end up producing a lot of Yocto Project specific files containing information which is already available in other forms. Throw in the multiple-versions issue I described before and you get a mess: large reviews of content, where maintainers have to waste time determining what needs review and what can be ignored as just transposed information. Again, the key is automation; that is what makes things simpler.

> How can one big recipe, instead of multiple small recipes, solve this problem?

I am not pushing a big recipe. Keep the go recipes much as they are now, but leverage the go tools to generate support artifacts.

MarkA
Hi Mark,

On 14.01.2022 16:22, Mark Asselstine wrote:
> We went through this with go recipes in meta-virt. It didn't work. You end up producing a lot of Yocto Project specific files containing information which is already available in other forms. Throw in the multiple-versions issue I described before and you get a mess [...]

I assume you want to use the versions the project recommends and not a single major version. What makes Go so special that the reasons for a single major version are irrelevant? Why don't we use multiple versions for C/C++ projects?

> Again, the key is automation; that is what makes things simpler.

Without structured information, any automation is impossible. Does the Go manifest contain all the information a recipe needs (license, CVE product name)? What happens if we detect a CVE in a dependency? How can we fix it?

> I am not pushing a big recipe. Keep the go recipes much as they are now, but leverage the go tools to generate support artifacts.

What are the artifacts? Does the Go community need these artifacts too?

Regards
Stefan
On 2022-01-14 11:35, Stefan Herbrechtsmeier wrote:
> I assume you want to use the versions the project recommends and not a single major version. What makes Go so special that the reasons for a single major version are irrelevant? Why don't we use multiple versions for C/C++ projects?

Sure, go projects can opt to use only released versions of dependencies, but this doesn't always happen. For now, and possibly into the future, we have to accept that this is not the case. Not using the versions recommended by a go project results in a level of invalidation of their testing and validation, as well as of CVE tracking. This actually maps to what happens with C/C++ and results in issues like "works for me", when one user happens to build against a slightly different library version than others use and hits an issue. What's more important: having consistently working software for the end user, or enforcing alternative dependencies to fit a model needed to complete a build?

> Without structured information, any automation is impossible. Does the Go manifest contain all the information a recipe needs (license, CVE product name)?

It contains much of it, and the things that are missing would be valid suggestions to bring up with the golang community.

> What happens if we detect a CVE in a dependency? How can we fix it?

The CVE would be applicable to the upstream project, and so with this approach we are in a position to work with the go project to resolve the CVE.

> What are the artifacts?

These would be YP-specific artifacts required to perform things like SBOM generation, etc.

> Does the Go community need these artifacts too?

Not the artifacts, but the base information to generate the artifacts would be of interest for the go projects to have, since the data could be reused by other, non-YP projects. This is similar to how the near-ubiquitous use of PyPI had Python projects line up a complete set of data that was useful not only on PyPI but elsewhere. We have seen a steady improvement in these matters, from poor execution in Java (to support Maven), to Python and Ruby, and now the new generation of this type of data in Golang and Rust. We can exploit this in YP so we are not stuck in the box of writing recipes.

Again, just my thoughts; I appreciate the back and forth and remain open to being convinced of alternative ideas in this matter.

Mark
Hi Mark,

On 14.01.2022 17:58, Mark Asselstine wrote:
> Not using the versions recommended by a go project results in a level of invalidation of their testing and validation, as well as of CVE tracking.

Who does this work, and for how long does he do it?

> This actually maps to what happens with C/C++ and results in issues like "works for me" [...]

Because we define the version, this doesn't happen for our users.

> What's more important: having consistently working software for the end user, or enforcing alternative dependencies to fit a model needed to complete a build?

What do you mean by this? The problem with the focus on working software is that security suffers. You rely on others and get into trouble if a vulnerability happens or somebody corrupts their own project.

> It contains much of it, and the things that are missing would be valid suggestions to bring up with the golang community.

Do you have an example of what is missing and of how golang should provide it?

> The CVE would be applicable to the upstream project, and so with this approach we are in a position to work with the go project to resolve the CVE.

This means you have to work with every project in the dependency chain until the change reaches your root project.

> These would be YP-specific artifacts required to perform things like SBOM generation, etc.

What is needed for the SBOM generation?

> Not the artifacts, but the base information to generate the artifacts would be of interest for the go projects to have, since the data could be reused by other, non-YP projects.

What information is missing besides the license? The problem with the license is that you have to trust the maintainer of the repository, or you have to guess the license at every build.

> This is similar to how the near-ubiquitous use of PyPI had Python projects line up a complete set of data [...] We can exploit this in YP so we are not stuck in the box of writing recipes.

Do you really think YP should switch to a distributed approach? Don't Log4Shell, 'colors' and 'faker' show the disadvantages of this approach? The recipes give you the possibility to fine-tune and override settings and to use a unified style across different languages.

> Again, just my thoughts; I appreciate the back and forth and remain open to being convinced of alternative ideas in this matter.

Thanks for your thoughts.

Regards
Stefan
On Fri, 14 Jan 2022 at 20:38, Stefan Herbrechtsmeier wrote:
> Do you really think YP should switch to a distributed approach? Don't Log4Shell, 'colors' and 'faker' show the disadvantages of this approach?
Once again, the world has decided that bundling dependencies is the way to build software. That ship has sailed, and we simply don't have the manpower or the influence to change this. What would help is support in the tools for a manifest which would:
- protect you from rogue upstreams, by locking down and locally caching source trees
- generate a useful SBOM that will tell you exactly where log4j is being pulled in, and at which version, should another critical vulnerability hit.
The recipe meanwhile would be simple, short and sweet: it only needs a checksum for the source tree and a checksum for the licensing, and we would put our trust in the upstream tooling to correctly verify the checksums - like we already do by trusting the git provided by the build host, for instance.
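Roughly this shape - to be clear, the license-manifest variable below is made up, this is just a sketch of the idea:

    # Sketch only: LICENSE_MANIFEST_CHKSUM is a hypothetical variable and
    # the values are placeholders, not a working recipe.
    SUMMARY = "Example npm application"
    LICENSE = "MIT"

    SRC_URI = "npm://registry.npmjs.org/;package=example-app;version=${PV}"
    # one checksum pins the whole locked source tree ...
    SRC_URI[sha256sum] = "<checksum of the locked source tree>"
    # ... and one pins the generated license/dependency manifest
    LICENSE_MANIFEST_CHKSUM = "<checksum of the generated manifest>"

    inherit npm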
Alex
On 2022-01-14 14:38, Stefan Herbrechtsmeier wrote:
> Who does this work, and for how long does he do it?

This work is done at the project level when we use the upstream configuration verbatim. Only if we decide to deviate in the YP would there be additional work - which is why I propose we stick to what we know the project is using, testing with and shipping in their releases.

> Because we define the version, this doesn't happen for our users.

The Yocto Project testing coverage is not as extensive as what is done in the upstream projects. There are certainly cases where problems exist in an application not due to the application code but due to the library versions in use. Even outside of such issues, this requires us to duplicate testing that the golang approach avoids (if we replicate the project configuration).

> What do you mean by this?

What is your motivation for dropping the dependency versions outlined in a go project's go.mod and instead enforcing a build against versions of your choosing? My point is that you are attempting to enforce an arbitrary decision that may affect how the software runs, as opposed to sticking with what the upstream project has tested, validated and knows to work. You are worried about the build and not about the end user of the application.

> The problem with the focus on working software is that security suffers. You rely on others and get into trouble if a vulnerability happens or somebody corrupts their own project.

By not building the upstream project as published, you have made the YP an exception. This does not improve tracking or fixing vulnerabilities.

> This means you have to work with every project in the dependency chain until the change reaches your root project.

You are phrasing this in an unfair way. Typically dependencies are not linear, nor can you assume the affected project will always be the furthest along a chain, so rarely would a CVE mean working with every project in the dependency chain. Beyond the wording: yes, you will need to work CVE fixes through a set of dependencies, in the same way you would when dealing with recipes.

> What is needed for the SBOM generation?

Right now in meta-virt we only build up SRC_URI and provide some hints on where to put dependencies, such that the main application can be built. We would want some additional information, such as licensing, to complete an SBOM. But I will leave this to those currently working on SBOM to chime in.

> What information is missing besides the license? The problem with the license is that you have to trust the maintainer of the repository, or you have to guess the license at every build.

This level of trust is universal. I suspect folks have transposed license details from PyPI without digging into the source code to validate the licensing.

> Do you really think YP should switch to a distributed approach? Don't Log4Shell, 'colors' and 'faker' show the disadvantages of this approach?
I think a project like YP can only exist at the scale it does, or grow larger, by using a distributed approach. What I am pushing, and have pushed in the past, is that YP is best served by not repeating work that is already done, freeing up time to improve on the things only it can perform. What YP can do that individual projects can't is system-level testing: where individual packages are typically tested in isolation, YP has the opportunity to test them in concert. But this is another discussion for another day.

> The recipes give you the possibility to fine-tune and override settings and to use a unified style across different languages.

And yet I would guess 95% of the Python bitbake recipe files are the same set of five lines, and half of those are includes. We aren't removing the possibility to make customizations, but why write recipes for everything when only a few need to be customized?
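For illustration - the package name, license and checksums below are made up - such a recipe is essentially just:

    # Illustrative pure-Python recipe: nearly everything here is
    # boilerplate that could be generated from the upstream metadata.
    SUMMARY = "Example pure-Python library"
    LICENSE = "MIT"
    LIC_FILES_CHKSUM = "file://LICENSE;md5=<md5 of the license file>"

    PYPI_PACKAGE = "example-lib"
    SRC_URI[sha256sum] = "<sha256 of the sdist tarball>"

    inherit pypi setuptools3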
MarkA
Hi Alex,

On 14.01.2022 20:58, Alexander Kanavin wrote:
> Once again, the world has decided that bundling dependencies is the way to build software. That ship has sailed, and we simply don't have the manpower or the influence to change this.

Do they have a choice? This could also be an opportunity to promote OE as a build system for docker images, with all the OE advantages and full control over the dependencies. Does this mean that we no longer try to remove bundled dependencies from C/C++ projects?

> What would help is support in the tools for a manifest [...] The recipe meanwhile would be simple, short and sweet: it only needs a checksum for the source tree and a checksum for the licensing, and we would put our trust in the upstream tooling to correctly verify the checksums.

This means a total change of the OE philosophy and tooling, because you change from central to distributed management and from static to dynamic metadata.

> Like we already do by trusting the git provided by the build host, for instance.

Do you really compare a program shipped by a distribution with a project in your dependency chain that is managed by an unknown person, or by nobody? I think we have both made our differing opinions clear. Any opinion from a TSC member would be helpful, because you propose fundamental changes.

Regards
Stefan
Hi Mark,

On 14.01.2022 21:09, Mark Asselstine wrote:
> This work is done at the project level when we use the upstream configuration verbatim. Only if we decide to deviate in the YP would there be additional work - which is why I propose we stick to what we know the project is using, testing with and shipping in their releases.

[snip]

But why do OE and distributions like Debian use a single project version instead of individual dependency versions? I think we have good reasons to use a single version, and these reasons are independent of the language and of any existing package manager.

I really miss a comment from an npm user and from a TSC member, because Alex and you propose fundamental changes to OE.

Regards
Stefan
On Mon, 2022-01-17 at 13:50 +0100, Stefan Herbrechtsmeier wrote:
> Hi Mark,
>
> On 14.01.2022 at 21:09, Mark Asselstine wrote:
> > On 2022-01-14 14:38, Stefan Herbrechtsmeier wrote:
> > > Hi Mark,
> > >
> > > On 14.01.2022 at 17:58, Mark Asselstine wrote:
> > > > On 2022-01-14 11:35, Stefan Herbrechtsmeier wrote:
> > > > > On 14.01.2022 at 16:22, Mark Asselstine wrote:
> > > > > > On 2022-01-14 10:05, Stefan Herbrechtsmeier wrote:
> > > > > > > On 14.01.2022 at 15:15, Mark Asselstine via
> > > > > > > lists.openembedded.org wrote:
> > > > > > > > On 2022-01-14 07:18, Alexander Kanavin wrote:
> > > > > > > > > If we do seriously embark on making npm/go better, the first
> > > > > > > > > step could be to make npm/go write out a reproducible manifest
> > > > > > > > > for the licenses and sub-packages that can be verified against
> > > > > > > > > two recipe checksums during fetch, and ensures no further
> > > > > > > > > network access is necessary. That alone would make it a viable
> > > > > > > > > fetcher. Those manifests could contain all needed information
> > > > > > > > > for further processing (e.g. versions, and what depends on what
> > > > > > > > > etc.) And yes, it's a bundled self-contained approach, but that
> > > > > > > > > matches how the rest of the world is using npm.
> > > > > > > >
> > > > > > > > I can't speak to npm, but for go this was where I wanted to see
> > > > > > > > things go. Just as work was done to avoid unexpected downloads of
> > > > > > > > Python eggs, I always felt the key to improving go integration
> > > > > > > > was some form of automated SRC_URI generation. Once this would be
> > > > > > > > available it could be leveraged for licensing and such.
> > > > > > > >
> > > > > > > > Stefan, by the way, the reason (a) is not possible is that
> > > > > > > > multiple go applications can use a shared 'library' but with
> > > > > > > > different versions (or even different git commit ids).
> > > > > > >
> > > > > > > Why is this simpler? The recipes need to list all the information
> > > > > > > about their dependencies. That means you repeat a lot of code and
> > > > > > > need to change a lot of files if you update a single dependency.
> > > > > >
> > > > > > We went through this with go recipes in meta-virt. It didn't work.
> > > > > > You end up producing a lot of Yocto Project specific files
> > > > > > containing information which is already available in other forms.
> > > > > > Throw in the multiple versions issue I described before and you get
> > > > > > a mess.
> > > > >
> > > > > I assume you want to use the version the project recommends and not
> > > > > a single major version. What makes Go so special that the reasons
> > > > > for a single major version are irrelevant? Why don't we use multiple
> > > > > versions for C/C++ projects?
> > > >
> > > > Sure, go projects can opt to only use released versions of
> > > > dependencies, but this doesn't always happen. For now, and possibly
> > > > into the future, we have to accept this is not the case.
> > > >
> > > > Not using the versions recommended in a go project will result in a
> > > > level of invalidation of their testing and validation as well as CVE
> > > > tracking.
> > >
> > > Who does this work, and for how long will they do it?
> >
> > This work is done at the project level when we use the upstream
> > configuration verbatim. Only if we decide to deviate in the YP would
> > there be additional work. Which is why I propose we stick to what we
> > know the project is using, testing with and shipping in their releases.
>
> [snip]
>
> But why do OE and distributions like Debian use a single project version
> instead of individual dependency versions? I think we have good reasons
> to use a single version, and these reasons are independent of the
> language or an existing package manager.
>
> I really miss a comment from an npm user and a TSC member, because Alex
> and you propose fundamental changes in OE.

The TSC had a meeting today and talked about this a little. I'm going to
give my opinion and the others on the TSC can chime in if they want to
add or differ.

Firstly, the c) option of highlighting issues with upstreams and working
with them is important and we need to do that. I'm taking it as a given
we will talk and try and work with them.

In parallel, we need to make solutions which work today. In many ways
there aren't perfect answers here, and a) or b) may be appropriate in
different cases.

What we in OE care about in particular is that:

* recipes precisely define the thing they're building such that it is
uniquely identified
* builds from mirrors/archives work; you don't need an online upstream to
make a build (and hence it can be reproduced from archive)

From that perspective I don't really care if the SRC_URI is long and ugly
in a recipe, as long as it precisely defines what is being built so it is
reproducible and offline builds/caches work.

Obviously individual recipes are nice to have in some cases, but in the
npm case, where that results in 5000 recipes, it simply won't work with
bitbake the way bitbake works today. We have no plans that will let us
scale bitbake to 5000 recipes, so we will need to look at the other
solutions.

Using language specific tools and language specific fetchers is ok, and
we are seeing that with the npm shrinkwrap and cargo plugins; this is
likely the direction we'll have to move going forward.

I appreciate there are challenges both ways, but does that give an idea
of the direction the TSC envisages?

Cheers,
Richard
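[A minimal sketch of the shrinkwrap direction Richard describes, using the npm/npmsw fetchers and the npm class as they exist in OE today; the package name `myapp`, homepage and checksum are placeholders, not a real recipe:]

```
SUMMARY = "Example npm application built from a pinned shrinkwrap"
HOMEPAGE = "https://example.com/myapp"
LICENSE = "MIT"
# placeholder checksum only; the real value comes from the fetched source
LIC_FILES_CHKSUM = "file://LICENSE;md5=0123456789abcdef0123456789abcdef"

# The shrinkwrap file pins every dependency version and checksum, so the
# fetch is fully defined by the recipe and works offline/from mirrors.
SRC_URI = " \
    npm://registry.npmjs.org/;package=myapp;version=${PV} \
    npmsw://${THISDIR}/${BPN}/npm-shrinkwrap.json \
"

inherit npm
```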
Even SRC_URI need not be long and ugly. Consider the git submodules fetcher: we simply list the single top-level revision in gitsm:// and trust that the git executable, having just that one revision, will both verify source integrity for all submodules and produce a tree suitable for archiving and offline builds.
I don't see why npm can't behave similarly: first produce a shrinkwrap with checksums separately (if not already provided by upstream), then trust that npm (which we can build ourselves) will use the shrinkwrap to provide the same guarantees as git does with submodules.
Alex
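[A sketch of the parallel Alex draws, in recipe terms. The gitsm:// and npmsw:// fetchers are real bitbake fetchers, but the URLs, revision and package layout below are hypothetical; the two SRC_URI lines belong to two separate example recipes:]

```
# Git submodules: one pinned top-level revision; git itself verifies and
# checks out the matching submodule revisions, so the whole tree is
# reproducible and archivable from that single SRCREV.
SRC_URI = "gitsm://git.example.com/myproject.git;protocol=https;branch=main"
SRCREV = "0123456789abcdef0123456789abcdef01234567"

# The npm equivalent: one pinned lock file with integrity checksums; the
# npmsw fetcher downloads everything the shrinkwrap lists, so no further
# network access is needed after do_fetch.
SRC_URI = "npmsw://${THISDIR}/${BPN}/npm-shrinkwrap.json"
```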
Hi Richard,

On 17.01.2022 at 23:46, Richard Purdie wrote:
> On Mon, 2022-01-17 at 13:50 +0100, Stefan Herbrechtsmeier wrote:
[snip]
> > I really miss a comment from an npm user and a TSC member, because
> > Alex and you propose fundamental changes in OE.
>
> The TSC had a meeting today and talked about this a little. I'm going
> to give my opinion and the others on the TSC can chime in if they want
> to add or differ.

Thanks for your opinions.

> Firstly, the c) option of highlighting issues with upstreams and
> working with them is important and we need to do that. I'm taking it
> as a given we will talk and try and work with them.

None of the solutions works against upstream, but in some cases the
opinions (e.g. stability vs. security) differ, and we need a way to
overcome this. The question is how many OE features we want to reuse,
and whether we want to support features which are missing in npm (e.g.
updating indirect dependencies or replacing deprecated packages).

> In parallel, we need to make solutions which work today. In many ways
> there aren't perfect answers here, and a) or b) may be appropriate in
> different cases.
>
> What we in OE care about in particular is that:
>
> * recipes precisely define the thing they're building such that it is
> uniquely identified

Does this include the licenses and license checksums of dependencies?
Does this include the CVE product names of all dependencies?

> * builds from mirrors/archives work; you don't need an online upstream
> to make a build (and hence it can be reproduced from archive)

Is this the only OE feature the npm integration should support?

> From that perspective I don't really care if the SRC_URI is long and
> ugly in a recipe, as long as it precisely defines what is being built
> so it is reproducible and offline builds/caches work.

Which approach do you prefer?
1) All dependency URLs inside the SRC_URI.
2) A language specific lock file beside the recipe in a meta layer.
3) Additional do_fetch_dependency and do_patch_dependency tasks after
do_patch, to reuse the language specific lock file from the recipe
source (see the sketch after this message).

> Obviously individual recipes are nice to have in some cases, but in
> the npm case, where that results in 5000 recipes, it simply won't work
> with bitbake the way bitbake works today. We have no plans that will
> let us scale bitbake to 5000 recipes, so we will need to look at the
> other solutions.

npm by design uses a lot of duplicates, multiple versions and deprecated
packages. Furthermore, the packages ship dead code or have missing
licenses. Without individual recipes it is impossible to fix these
issues, or we have to maintain fixes in multiple recipes or fix them
outside of OE.

> Using language specific tools and language specific fetchers is ok,
> and we are seeing that with the npm shrinkwrap and cargo plugins; this
> is likely the direction we'll have to move going forward.

The npm shrinkwrap fetcher doesn't use a language specific tool. It
extracts the URLs from the lock file and uses the normal fetchers to
generate the dependency package tree.

> I appreciate there are challenges both ways, but does that give an
> idea of the direction the TSC envisages?

If I understand you correctly, you say that the solution of individual
recipes won't work for npm because of the high number of individual
packages, and it isn't planned to fix this. The npm integration needs
to support the download proxy. What about all the other features of OE
I mentioned in my first mail? Should OE only be a build runner for
other package managers?

This means you have to manage the dependencies outside of OE. This
leads to the following questions:
1) Should it be possible to use a lock file from the source?
2) Should it be possible to patch the lock file?
3) Should it be possible to patch a dependency?
4) Must a patch be applied to all packages with the same version, or
should it be applied to an individual package inside the dependency
tree?
5) Should the recipe detect license changes of dependencies?
6) Should the recipetool or build process generate the licenses and
license checksums?
7) Should the recipetool or build process extract the CVE product name
and version?

It would be nice if you could help me to get a clear vision about the
integration of language specific package managers so that I can adapt
my work.

Regards
Stefan
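[A rough sketch of option 3 from Stefan's list above. These task names do not exist in OE-core today; the bodies are purely illustrative of the proposed task ordering, not working fetch code:]

```
# Hypothetical: fetch dependencies only after do_patch, so a lock file
# taken from (and possibly patched in) the source tree drives the
# download instead of a copy kept beside the recipe.
python do_fetch_dependency() {
    # would parse ${S}/npm-shrinkwrap.json and fetch each listed tarball
    # into ${DL_DIR}, verifying the recorded integrity checksums
    bb.note("fetching dependencies from the (patched) lock file")
}
addtask fetch_dependency after do_patch before do_configure

# Hypothetical: apply per-dependency patches to the unpacked packages,
# addressing questions 3) and 4) above.
python do_patch_dependency() {
    bb.note("patching individual dependencies in the dependency tree")
}
addtask patch_dependency after do_fetch_dependency before do_configure
```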