[OE-core] Adding more information to the SBOM
Hi Joshua,
nice to meet you! I'm new to this list, and I've always approached Yocto only from the "IP compliance side", so I may be missing important pieces of information. That is why Marta encouraged me to ask for community feedback, and is helping me do so.

On 2022-09-14 16:56, Joshua Watt wrote:
> On Wed, Sep 14, 2022 at 9:16 AM Marta Rybczynska <rybczynska@...> wrote:
>> Dear all,
> I believe we map the binaries to the source code from the -dbg [...]

This was also my assumption at the beginning. But then I found that there are recipes with multiple upstream sources, which may be combined/mixed together in the recipe's WORKDIR. For instance this one:

https://git.yoctoproject.org/meta-virtualization/tree/recipes-networking/cni/cni_git.bb

SRC_URI = "\
    git://github.com/containernetworking/cni.git;branch=main;name=cni;protocol=https \
    git://github.com/containernetworking/plugins.git;branch=release-1.1;destsuffix=${S}/src/github.com/containernetworking/plugins;name=plugins;protocol=https \
    git://github.com/flannel-io/cni-plugin;branch=main;name=flannel_plugin;protocol=https;destsuffix=${S}/src/github.com/containernetworking/plugins/plugins/meta/flannel \
    "

(The third source is unpacked in a subdirectory of the second one.)

From this I discovered that we can't assume that the first non-local URI is the downloadLocation for all source files, because that is not always the case. Moreover, in the context of our project we also needed to find the upstream sources for local patches, scripts, etc. added by recipes (i.e. the corresponding layers' repos).

>> Alberto has worked on how to obtain the missing data and now has a [...]
> Please be a little careful with the wording; SBoMs have a lot of uses, [...]

You are right, sorry! "Real" is meant in the context of our project, where we need to make our Fossology Audit Team work on "original" upstream source packages/repos, for a number of reasons. The main one is that in the Oniro project we have a complex build matrix with a lot of available target machines and quite a number of different overrides depending on the machine, so when it comes to IP compliance we need to aggregate and simplify, otherwise our IP auditors would die :)

But since our Audit Team, unlike in a commercial project, is working fully in the open, other projects may also benefit from this approach: with fully reviewed file-level license data publicly available for quite a number of upstream sources and Yocto layers, a complete source-to-binary tracking system would enable any Yocto project to get very detailed license information for its images, to automatically detect license incompatibilities between linked binary files, etc.

>> - carefully describe what is found in a final image (i.e. binary files [...]
> IIUC this is the difference between the "Declared" license and the [...]

The issue is with components like util-linux, which contains a lot of sub-components subject to different licenses. The util-linux recipe's license is "GPL-2.0-or-later & LGPL-2.1-or-later & BSD-3-Clause & BSD-4-Clause", but from that information one cannot tell whether a particular binary file generated from util-linux is subject to GPL, LGPL, or BSD-3|4-Clause. Of course, being able to track upstream sources to binaries at file level would be useless without file-level license information; but since Scancode and Fossology (and our Audit Team) can provide such information, such tracking may become super-useful, in our opinion.
>> - automatically check license incompatibilities at the binary file level.
> This seems promising as something that could potentially move into [...]

Thanks for the suggestion, could you point me to Richard's work? I'll surely look into it.

> - I would encourage you to not wait to turn this into a bbclass [...]

Understood :) I'm the newbie here, so any other suggestion is warmly welcome.

Regards,
Alberto
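
Coming back to the cni recipe quoted earlier in this message, the following is a minimal sketch of how one could list every non-local entry of a SRC_URI value instead of taking only the first one. It is plain string handling, not BitBake's fetcher API, and the function names `parse_src_uri` and `upstream_sources` are made up for illustration. It assumes entries are whitespace-separated and parameters are `;key=value` pairs, as in the recipe above.

```python
# Illustrative only: plain string parsing, NOT BitBake's own fetcher code.
# The SRC_URI value is copied from the cni_git.bb recipe quoted in the message.
SRC_URI = r"""\
git://github.com/containernetworking/cni.git;branch=main;name=cni;protocol=https \
git://github.com/containernetworking/plugins.git;branch=release-1.1;destsuffix=${S}/src/github.com/containernetworking/plugins;name=plugins;protocol=https \
git://github.com/flannel-io/cni-plugin;branch=main;name=flannel_plugin;protocol=https;destsuffix=${S}/src/github.com/containernetworking/plugins/plugins/meta/flannel \
"""


def parse_src_uri(value):
    """Split a SRC_URI value into (url, params) pairs (hypothetical helper)."""
    entries = []
    for token in value.split():
        if token == "\\":  # line-continuation backslash, not an entry
            continue
        url, *raw_params = token.split(";")
        params = {}
        for raw in raw_params:
            key, _, val = raw.partition("=")
            params[key] = val
        entries.append((url, params))
    return entries


def upstream_sources(value):
    """Return all entries that are not local file:// URIs (hypothetical helper)."""
    return [(url, params) for url, params in parse_src_uri(value)
            if not url.startswith("file://")]


if __name__ == "__main__":
    for url, params in upstream_sources(SRC_URI):
        print(params.get("name"), url, params.get("destsuffix", "(default)"))
```

For this recipe the sketch yields three upstream sources, and the `destsuffix` of the third lies inside the `destsuffix` of the second, which is why a single downloadLocation cannot describe every file in the work directory.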
Joshua Watt
On Wed, Sep 14, 2022 at 12:10 PM Alberto Pianon <alberto@...> wrote:
This is true, but I think that's more of a problem with the inability to express multiple download locations in the SPDX, not that we don't have all the source when we generate the SPDX, correct? I _believe_ the -dbg package still contains all the source code from all three URLs?

Ok, so this makes me wonder: if we implement the better source extraction in OE-core, does that help this problem? Is the primary problem that you want the unpatched upstream source code files instead of the patched ones, or is it some other problem? AFAIK, the -dbg package contains the source code we actually compiled... so I have a hard time understanding what's "incorrect" (or not ideal) about referencing it; but I think I'm missing something important :)

>> Please be a little careful with the wording; SBoMs have a lot of uses, [...]
> You are right, sorry! "real" is meant in the context of our project, [...]

Ok, so let me see if I can follow what you want here:

1) Your Audit Team scans some open source repository, and generates some sort of license report for it
2) You do a Yocto build that builds that repository
3) You want to link the SBoM generated by Yocto back to the report from the Audit Team; specifically, you want to be able to trace binaries in the system back to the original source code from the Audit Team report?

Currently #3 is difficult because:

1) Yocto only reports one SRC_URI in the SBoM
2) Binaries are tracked back to the patched source code (in the -dbg packages), so the checksums may not match the original upstream source code

Any other reasons?
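
To illustrate the checksum point in the second list, here is a minimal sketch of matching built source files against upstream files by checksum. The file names and contents are invented, the helper `match_to_upstream` is hypothetical, and SHA-1 is used only because it is a common choice for file checksums in SPDX documents; none of this is taken from the OE-core SPDX code.

```python
import hashlib


def sha1(data: bytes) -> str:
    """Hex SHA-1 digest of a file's contents."""
    return hashlib.sha1(data).hexdigest()


def match_to_upstream(built_files, upstream_files):
    """Map each built file path to the upstream path with an identical
    checksum, or None when no upstream file matches (hypothetical helper)."""
    by_checksum = {sha1(content): path for path, content in upstream_files.items()}
    return {path: by_checksum.get(sha1(content))
            for path, content in built_files.items()}


if __name__ == "__main__":
    # Invented example data: one file is patched during the build, one is not.
    upstream = {
        "main.c": b"int main(void) { return 0; }\n",
        "util.c": b"int helper(void) { return 1; }\n",
    }
    built = {
        "main.c": b"int main(void) { return 2; } /* patched */\n",
        "util.c": b"int helper(void) { return 1; }\n",
    }
    print(match_to_upstream(built, upstream))
```

A patched file no longer matches any upstream checksum, so linking it back to an audit report would need extra information, such as which patches were applied to which upstream file.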
>> IIUC this is the difference between the "Declared" license and the [...]
> The issue is with components like util-linux, which contains a lot of [...]

We also implement (and report) some rudimentary license scanning in Yocto, but we only look for "SPDX-License-Identifier" tags.

>>> - automatically check license incompatibilities at the binary file [...]
>> This seems promising as something that could potentially move into [...]
> Thanks for the suggestion, could you point me to Richard's work?
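
As a rough idea of what scanning only for "SPDX-License-Identifier" tags means, here is a minimal sketch. It is not the OE-core implementation; the regular expression and the function name `find_spdx_tags` are assumptions made for illustration, and the handling of comment closers is deliberately simplistic.

```python
import re

# Capture everything after the tag up to the end of the line.
TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def find_spdx_tags(text):
    """Return the license expressions declared via SPDX tags in `text`
    (hypothetical helper, not the OE-core scanner)."""
    found = []
    for match in TAG.finditer(text):
        expr = match.group(1).strip()
        if expr.endswith("*/"):  # drop a trailing C comment closer
            expr = expr[:-2].strip()
        found.append(expr)
    return found


if __name__ == "__main__":
    c_source = "/* SPDX-License-Identifier: GPL-2.0-or-later */\nint x;\n"
    print(find_spdx_tags(c_source))
```

A file without such a tag yields an empty list, which is where a scanner such as Scancode or Fossology, or a manual audit, would have to fill in the file-level license.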