Re: Status and future of npm and go support


Richard Purdie
 

On Tue, 2022-01-18 at 08:54 +0100, Stefan Herbrechtsmeier wrote:
Hi Richard,

Am 17.01.2022 um 23:46 schrieb Richard Purdie:
On Mon, 2022-01-17 at 13:50 +0100, Stefan Herbrechtsmeier wrote:

I am really missing a comment from an npm user and a TSC member, because you
and Alex propose fundamental changes in OE.
The TSC had a meeting today and talked about this a little. I'm going to give my
opinion and the others on the TSC can chime in if they want to add or differ.
Thanks for your opinions.

Firstly, the c) option of highlighting issues with upstreams and working with
them is important and we need to do that. I'm taking it as a given we will talk
and try and work with them.
None of the solutions works against upstream, but in some cases the
opinions (e.g. stability vs security) are different and we need a way to
overcome this. The question is how many of the OE features we want to reuse
and whether we want to support features which are missing in npm (e.g.
updating indirect dependencies or replacing deprecated packages).

In parallel, we need to make solutions which work today. In many ways there
aren't perfect answers here and a) or b) may be appropriate in different cases.
What we in OE care about in particular is that:

* recipes precisely define the thing they're building such that it is uniquely
identified
Does this include the licenses and license checksums of dependencies?
Does this include the CVE product names of all dependencies?
Ideally we will need this information to be visible to our tools, yes.

* builds from mirrors/archives work, you don't need an online upstream to make a
build (and hence it can be reproduced from archive)
Is this the only OE feature the npm integration should support?
No. I deliberately tried to stress the most important things we needed to
ensure. There are other important things too, e.g. the license handling and CVE
issues. These are also important but not quite as important as the other two
issues I mentioned.

From that perspective I don't really care if the SRC_URI is long and ugly in a
recipe as long as it precisely defines what is being built so it is reproducible
and offline builds/caches work.
Which approach do you prefer?
1) All dependency urls inside the SRC_URI.
2) Language specific lock file beside the recipe in a meta layer.
I suspect that realistically we'll end up with a language specific lock file. If
we can avoid that, great but I suspect we likely can't in some cases.

3) Additional do_fetch_dependency and do_patch_dependency task after
do_patch to reuse the language specific lock file from the recipe source.
I don't prefer that as it breaks our "standard" task structure. I think
conceptually these things belong as part of fetch/unpack/patch.

Obviously individual recipes are nice to have in some cases but in the npm case
where that results in 5000 recipes, it simply won't work with bitbake the way
bitbake works today. We have no plans that will let us scale bitbake to 5000
recipes so we will need to look at the other solutions.
NPM by design uses a lot of duplicates, multiple versions and deprecated
packages. Furthermore, the packages ship dead code or have missing
licenses. Without individual recipes it is impossible to fix these issues,
or we have to maintain fixes in multiple recipes or fix them outside of OE.
That is really an issue with npm and trying to go against that is going totally
against the flow of that language. I'm not sure we have the resources to be able
to do that?

I agree this does have security implications and I hope over time the npm
community starts to realise it, but I don't realistically see we can go against it
alone. Ideally for us, we want to at least be able to identify the issues but it
isn't our problem to solve.

Using language specific tools and language specific fetchers is ok and we are
seeing that with the npm shrinkwrap and cargo plugins and this is likely the
direction we'll have to move going forward.
The npm shrinkwrap fetcher doesn't use a language specific tool. It
extracts the URLs from the lock file and uses the normal fetcher to
generate the dependency package tree.
I think that is a pragmatic approach and am fine with that, in fact I quite like
it.
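
As a rough illustration of that approach (a sketch only, not the actual fetcher
code), the following walks a shrinkwrap-style lock file and lists the URL for
each node of the dependency tree, so that an ordinary URL fetcher could
download and mirror them. It assumes the older nested "dependencies" layout of
the lock file; the function name, the registry host and the integrity strings
are hypothetical placeholders.

```python
import json


def collect_locked_urls(lock, prefix=""):
    """Walk a nested "dependencies" tree and return a list of
    (tree_path, version, resolved_url, integrity) tuples."""
    entries = []
    for name, info in sorted(lock.get("dependencies", {}).items()):
        path = f"{prefix}node_modules/{name}"
        # Entries without a "resolved" URL (e.g. bundled ones) are skipped.
        if "resolved" in info:
            entries.append(
                (path, info.get("version"), info["resolved"], info.get("integrity"))
            )
        # Nested dependencies are unpacked below their parent package.
        entries.extend(collect_locked_urls(info, prefix=path + "/"))
    return entries


# Hypothetical lock file content; the host and hashes are placeholders.
EXAMPLE_LOCK = json.loads("""
{
  "name": "example-app",
  "version": "0.1.0",
  "dependencies": {
    "a": {
      "version": "1.0.0",
      "resolved": "https://registry.example.invalid/a/-/a-1.0.0.tgz",
      "integrity": "sha512-PLACEHOLDER-A",
      "dependencies": {
        "b": {
          "version": "2.0.0",
          "resolved": "https://registry.example.invalid/b/-/b-2.0.0.tgz",
          "integrity": "sha512-PLACEHOLDER-B"
        }
      }
    }
  }
}
""")

if __name__ == "__main__":
    for path, version, url, integrity in collect_locked_urls(EXAMPLE_LOCK):
        print(path, version, url, integrity)
```

Each tuple carries a fixed URL plus a checksum, which is what lets the download
go through the usual mirror and archive handling instead of a live registry.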

I appreciate there are challenges both ways but does that give an idea of the
direction the TSC envisages?
If I understand you correctly, you say that the solution of individual
recipes won't work for npm because of the high number of individual
packages and it isn't planned to fix this.
I understand the issues but firstly, I think trying to change that goes against
the language and secondly, it isn't practical with the way bitbake works today. I
certainly don't have the time/resources/priority to change bitbake to work
efficiently with thousands of recipes in every build.

The npm integration needs to support the download proxy. What about all
the other features of OE I mentioned in my first mail?
I'm trying to give you an idea of the priority of the different features. In a
perfect world we'd support them all but we have limited people with a focus on
this.

Should OE only be a build runner for other package managers?
Not necessarily, no. I've tried to be clear that the solutions will likely vary
depending on circumstance. npm is too granular for us to support on a per recipe
basis. That may not be the case for other languages but they may also have their
own constraints/issues/challenges.

This means you have to manage the dependencies outside of OE. This leads to the
following questions:
1) Should it be possible to use a lock file from the source?
2) Should it be possible to patch the lock file?
3) Should it be possible to patch a dependency?
4) Must a patch be applied to all packages with the same version or
should it be applied to an individual package inside the dependency tree?
5) Should the recipe detect license changes of dependencies?
6) Should the recipetool or build process generate the licenses and
license checksums?
7) Should the recipetool or build process extract the CVE product name
and version?

It would be nice if you could help me to get a clear vision about the
integration of language specific package managers so that I can adapt my
work.
I'm acutely aware that I'm not the one doing this work and I don't really want
to impose impractical constraints. My response has therefore been open ended
deliberately as I don't want to pin us into a corner. I've highlighted the
specific issues I'm aware of, e.g. that thousands of dependencies at the bitbake
level likely won't work and I'm not aware of any way to make that happen right
now. I don't really know the correct answer to some of the above questions.

Ideally, yes, tools would generate the correct license and CVE data. Ideally,
the build would verify that data is still correct, much as we do with the
license checksums for other recipes.
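
A minimal sketch of what such a verification step could look like for
dependencies, assuming a tool has previously recorded an md5 checksum per
license file. The function names and the layout of the recorded data are
hypothetical and not part of any existing OE class.

```python
import hashlib
from pathlib import Path


def license_md5(path):
    """Return the md5 hex digest of a license file's raw bytes."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def changed_licenses(recorded, root):
    """Compare recorded checksums against the unpacked dependency tree.

    recorded maps a license file path (relative to root) to its expected md5.
    Returns a list of (relative_path, expected, actual) for every mismatch;
    actual is None when the license file is missing.
    """
    changed = []
    for rel, expected in sorted(recorded.items()):
        license_file = Path(root) / rel
        actual = license_md5(license_file) if license_file.is_file() else None
        if actual != expected:
            changed.append((rel, expected, actual))
    return changed
```

A build could fail, or at least warn, whenever the returned list is non-empty,
which would flag a dependency whose license text changed after an update.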

Cheers,

Richard
