[master][langdale][kirkstone][PATCH v2] rng-tools: backport patch to adjust jitterentropy library to timeout/fail on long delay


Xiangyu Chen
 

Backport patch from upstream[1] to adjust jitter to timeout on init after 5 seconds in the event it takes
to long to gether jitter entropy.This also fix rng-tools take full cpu usage with whole cores on ARM platforms.

[1] https://github.com/nhorman/rng-tools/pull/171/commits/c29424f10a0dcbd18ac25607fa1c81c18a960e81

Signed-off-by: Xiangyu Chen <xiangyu.chen@...>
---
Changes in v2:
* add libgcc in RDEPENDS flag to solve testing failed in core-image-full-cmdline

---
...ropy-library-to-timeout-fail-on-long.patch | 144 ++++++++++++++++++
.../rng-tools/rng-tools_6.15.bb | 2 +
2 files changed, 146 insertions(+)
create mode 100644 meta/recipes-support/rng-tools/rng-tools/0001-Adjust-jitterentropy-library-to-timeout-fail-on-long.patch

diff --git a/meta/recipes-support/rng-tools/rng-tools/0001-Adjust-jitterentropy-library-to-timeout-fail-on-long.patch b/meta/recipes-support/rng-tools/rng-tools/0001-Adjust-jitterentropy-library-to-timeout-fail-on-long.patch
new file mode 100644
index 0000000000..d70c6587aa
--- /dev/null
+++ b/meta/recipes-support/rng-tools/rng-tools/0001-Adjust-jitterentropy-library-to-timeout-fail-on-long.patch
@@ -0,0 +1,144 @@
+From 3f1d6e53985e40cbe4c7380ce503ca2778d4cd9d Mon Sep 17 00:00:00 2001
+From: Neil Horman <nhorman@...>
+Date: Mon, 16 May 2022 13:38:54 -0400
+Subject: [PATCH] Adjust jitterentropy library to timeout/fail on long delay
+
+When running rngd -l its possible, on platforms that have low jitter
+entropy to block for long periods of time. Adjust jitter to timeout on
+init after 5 seconds in the event it takes to long to gether jitter
+entropy
+
+Also while we're at it, I might have a build solution for the presence
+of internal timers. When jitterentropy is built without internal
+timers, jent_notime_init is defined publically, but when it is built
+with timers, its declared as a static symbol, preenting resolution, so
+we can test to see if the function exists. If it does we _don't_ have
+notime support. The logic is a bit backwards, but i think it works
+
+Upstream-Status: Backport from
+[https://github.com/nhorman/rng-tools/pull/171/commits/c29424f10a0dcbd18ac25607fa1c81c18a960e81]
+
+Signed-off-by: Xiangyu Chen <xiangyu.chen@...>
+---
+ configure.ac | 6 ++---
+ rngd_jitter.c | 61 +++++++++++++++++++++++++++++++++++++++------------
+ 2 files changed, 50 insertions(+), 17 deletions(-)
+
+diff --git a/configure.ac b/configure.ac
+index 40008ca..2e12308 100644
+--- a/configure.ac
++++ b/configure.ac
+@@ -94,9 +94,9 @@ AS_IF(
+ AC_SEARCH_LIBS(jent_version,jitterentropy,
+ [AM_CONDITIONAL([JITTER], [true])
+ AC_DEFINE([HAVE_JITTER],1,[Enable JITTER])
+- AC_CHECK_LIB(jitterentropy, jent_entropy_switch_notime_impl,
+- [AC_DEFINE([HAVE_JITTER_NOTIME],1,[Enable JITTER_NOTIME])],
+- [],-lpthread)],
++ AC_CHECK_LIB(jitterentropy, jent_notime_init,
++ [],
++ [AC_DEFINE([HAVE_JITTER_NOTIME],1, [Enable JITTER_NOTIME])],-lpthread)],
+ AC_MSG_NOTICE([No Jitterentropy library found]),-lpthread)
+ ], [AC_MSG_NOTICE([Disabling JITTER entropy source])]
+ )
+diff --git a/rngd_jitter.c b/rngd_jitter.c
+index d1b17ba..3647b7f 100644
+--- a/rngd_jitter.c
++++ b/rngd_jitter.c
+@@ -400,6 +400,8 @@ int init_jitter_entropy_source(struct rng *ent_src)
+ int entflags = 0;
+ int ret;
+ int core_id = 0;
++ struct timespec base, now;
++ int rc;
+
+ signal(SIGUSR1, jitter_thread_exit_signal);
+
+@@ -508,6 +510,10 @@ int init_jitter_entropy_source(struct rng *ent_src)
+ CPU_FREE(cpus);
+ cpus = NULL;
+
++ flags = fcntl(pipefds[0], F_GETFL, 0);
++ flags |= O_NONBLOCK;
++ fcntl(pipefds[0], F_SETFL, flags);
++
+ if (ent_src->rng_options[JITTER_OPT_USE_AES].int_val) {
+ /*
+ * Temporarily disable aes so we don't try to use it during init
+@@ -516,32 +522,59 @@ int init_jitter_entropy_source(struct rng *ent_src)
+ message_entsrc(ent_src,LOG_CONS|LOG_INFO, "Initializing AES buffer\n");
+ aes_buf = malloc(tdata[0].buf_sz);
+ ent_src->rng_options[JITTER_OPT_USE_AES].int_val = 0;
+- if (xread_jitter(key, AES_BLOCK, ent_src)) {
+- message_entsrc(ent_src,LOG_CONS|LOG_INFO, "Unable to obtain AES key, disabling AES in JITTER source\n");
+- } else if (xread_jitter(iv_buf, CHUNK_SIZE, ent_src)) {
+- message_entsrc(ent_src,LOG_CONS|LOG_INFO, "Unable to obtain iv_buffer, disabling AES in JITTER source\n");
++ clock_gettime(CLOCK_REALTIME, &base);
++ do {
++ rc = xread_jitter(key, AES_BLOCK, ent_src);
++ clock_gettime(CLOCK_REALTIME, &now);
++ } while (rc && ((now.tv_sec - base.tv_sec) < 5));
++
++ if (rc) {
++ message_entsrc(ent_src,LOG_CONS|LOG_INFO, "Unable to obtain AES key, disabling JITTER source\n");
++ close_jitter_entropy_source(ent_src);
++ return 1;
++ }
++ do {
++ rc = xread_jitter(iv_buf, CHUNK_SIZE, ent_src);
++ clock_gettime(CLOCK_REALTIME, &now);
++ } while (rc && ((now.tv_sec - base.tv_sec) < 5));
++
++ if (rc) {
++ message_entsrc(ent_src,LOG_CONS|LOG_INFO, "Unable to obtain iv_buffer, disabling JITTER source\n");
++ close_jitter_entropy_source(ent_src);
++ return 1;
+ } else {
+ /* re-enable AES */
+ ent_src->rng_options[JITTER_OPT_USE_AES].int_val = 1;
+ ossl_ctx = ossl_aes_init(key, iv_buf);
+ }
+- xread_jitter(aes_buf, tdata[0].buf_sz, ent_src);
++
++ do {
++ rc = xread_jitter(aes_buf, tdata[0].buf_sz, ent_src);
++ clock_gettime(CLOCK_REALTIME, &now);
++ } while (rc && ((now.tv_sec - base.tv_sec) < 5));
++ if (rc) {
++ message_entsrc(ent_src,LOG_CONS|LOG_INFO, "Unable to obtain aes buffer, disabling JITTER source\n");
++ close_jitter_entropy_source(ent_src);
++ return 1;
++ }
++
+ } else {
+ /*
+- * Make sure that an entropy gathering thread has generated
+- * at least some entropy before setting O_NONBLOCK and finishing
+- * the entropy source initialization.
+- *
+ * This avoids "Entropy Generation is slow" log spamming that
+ * would otherwise happen until jent_read_entropy() has run
+ * for the first time.
+ */
+- xread_jitter(&i, 1, ent_src);
+- }
++ do {
++ rc = xread_jitter(&i, 1, ent_src);
++ clock_gettime(CLOCK_REALTIME, &now);
++ } while (rc && ((now.tv_sec - base.tv_sec) < 5));
++ if (rc) {
++ message_entsrc(ent_src,LOG_CONS|LOG_INFO, "Unable to prime jitter source, disabling JITTER source\n");
++ close_jitter_entropy_source(ent_src);
++ return 1;
++ }
+
+- flags = fcntl(pipefds[0], F_GETFL, 0);
+- flags |= O_NONBLOCK;
+- fcntl(pipefds[0], F_SETFL, flags);
++ }
+
+ message_entsrc(ent_src,LOG_DAEMON|LOG_INFO, "Enabling JITTER rng support\n");
+ return 0;
+--
+2.34.1
+
diff --git a/meta/recipes-support/rng-tools/rng-tools_6.15.bb b/meta/recipes-support/rng-tools/rng-tools_6.15.bb
index efc08b5e0a..d16675730a 100644
--- a/meta/recipes-support/rng-tools/rng-tools_6.15.bb
+++ b/meta/recipes-support/rng-tools/rng-tools_6.15.bb
@@ -7,11 +7,13 @@ BUGTRACKER = "https://github.com/nhorman/rng-tools/issues"
LICENSE = "GPL-2.0-only"
LIC_FILES_CHKSUM = "file://COPYING;md5=b234ee4d69f5fce4486a80fdaf4a4263"
DEPENDS = "sysfsutils openssl"
+RDEPENDS:${PN} = "libgcc"

SRC_URI = "git://github.com/nhorman/rng-tools.git;branch=master;protocol=https \
file://init \
file://default \
file://rng-tools.service \
+ file://0001-Adjust-jitterentropy-library-to-timeout-fail-on-long.patch \
"
SRCREV = "381f69828b782afda574f259c1b7549f48f9bb77"

--
2.17.1


Alexander Kanavin
 

On Sun, 27 Nov 2022 at 14:39, Xiangyu Chen
<xiangyu.chen@...> wrote:
* add libgcc in RDEPENDS flag to solve testing failed in core-image-full-cmdline
+RDEPENDS:${PN} = "libgcc"
This needs to be better investigated first. The change looks like
solving the symptom, but not the actual issue.

Alex


Ross Burton
 

On 28 Nov 2022, at 09:25, Alexander Kanavin via lists.openembedded.org <alex.kanavin=gmail.com@...> wrote:

On Sun, 27 Nov 2022 at 14:39, Xiangyu Chen
<xiangyu.chen@...> wrote:
* add libgcc in RDEPENDS flag to solve testing failed in core-image-full-cmdline
+RDEPENDS:${PN} = "libgcc"
This needs to be better investigated first. The change looks like
solving the symptom, but not the actual issue.
It’s not uncommon: https://bugzilla.yoctoproject.org/show_bug.cgi?id=10954

Ross


Randy MacLeod
 

On 2022-11-29 10:52, Ross Burton via lists.openembedded.org wrote:
On 28 Nov 2022, at 09:25, Alexander Kanavin via lists.openembedded.org <alex.kanavin=gmail.com@...> wrote:

On Sun, 27 Nov 2022 at 14:39, Xiangyu Chen
<xiangyu.chen@...> wrote:
* add libgcc in RDEPENDS flag to solve testing failed in core-image-full-cmdline
+RDEPENDS:${PN} = "libgcc"
This needs to be better investigated first. The change looks like
solving the symptom, but not the actual issue.
It’s not uncommon: https://bugzilla.yoctoproject.org/show_bug.cgi?id=10954
While it would be good to solve the general case of libgcc dependency mentioned in that case,
I do wonder if the original issue that Xiangyu saw, goes away if we _only_ add:

+RDEPENDS:${PN} = "libgcc"

to rngtools without added the upstream timeout after 5 seconds or
it the dependency is only required with the patch?

Xiangyu ?


I also don't like the arbitrary timeout and I'd like references
from the board vendor that explains that proper RNG isn't supported!
( I'm not a BSP person so this may be optimistic! )


Note that there is a follow-up patch to make the timeout configurable, see below.
If we take the original patch, we should take both and require that the BSP tune
the timeout. We'll get this behaviour once the patches are integrated into a release
so we might as well figure out if we're going to:
1. have a tuning policy,
2. continually revert the patch or
3. actually get to the root cause of the problem (by working with board vendors I expect).


../Randy

commit 6e14b8bd491b3f94f60f36748959e0ae7d7c1f95
Author: Dan Horák <dan@...>
Date: Mon Jul 18 04:47:41 2022

rngd_jitter: make timeout configurable

The current default 5 sec timeout can be too short in some situations
(eg. slow hardware). Introduce a command line option (jitter:timeout)
to make the value configurable. Increasing the default timeout is not
desirable, because it might slow down system startup when there are
other entropy sources.

Signed-off-by: Dan Horák <dan@...>

commit c29424f10a0dcbd18ac25607fa1c81c18a960e81 (origin/jitter-init)
Author: Neil Horman <nhorman@...>
Date: Mon May 16 13:38:54 2022

Adjust jitterentropy library to timeout/fail on long delay

When running rngd -l its possible, on platforms that have low jitter
entropy to block for long periods of time. Adjust jitter to timeout on
init after 5 seconds in the event it takes to long to gether jitter
entropy

Also while we're at it, I might have a build solution for the presence
of internal timers. When jitterentropy is built without internal
timers, jent_notime_init is defined publically, but when it is built
with timers, its declared as a static symbol, preenting resolution, so
we can test to see if the function exists. If it does we _don't_ have
notime support. The logic is a bit backwards, but i think it works


Ross
--
# Randy MacLeod
# Wind River Linux


Xiangyu Chen
 

On 11/30/22 06:28, Randy MacLeod wrote:
On 2022-11-29 10:52, Ross Burton via lists.openembedded.org wrote:
On 28 Nov 2022, at 09:25, Alexander Kanavin via lists.openembedded.org <alex.kanavin=gmail.com@...> wrote:

On Sun, 27 Nov 2022 at 14:39, Xiangyu Chen
<xiangyu.chen@...> wrote:
  * add libgcc in RDEPENDS flag to solve testing failed in core-image-full-cmdline
+RDEPENDS:${PN} = "libgcc"
This needs to be better investigated first. The change looks like
solving the symptom, but not the actual issue.
It’s not uncommon: https://bugzilla.yoctoproject.org/show_bug.cgi?id=10954
While it would be good to solve the general case of libgcc dependency mentioned in that case,
I do wonder if the original issue that Xiangyu saw, goes away if we _only_ add:

 +RDEPENDS:${PN} = "libgcc"

to rngtools without added the upstream timeout after 5 seconds or
it the dependency is only required with the patch?

Xiangyu ?
The software jitter thread will call the pthread_exit after timeout value reached, the code has already exists there but normally won't be called,  when applying this patch, the timeout logic can go there, that's the reason missing libgcc which was reported by oe testing



I also don't like the arbitrary timeout and I'd like references
from the board vendor that explains that proper RNG isn't supported!
( I'm not a BSP person so this may be optimistic! )
Note that there is a follow-up patch to make the timeout configurable, see below.
If we take the original patch, we should take both and require that the BSP tune
the timeout. We'll get this behaviour once the patches are integrated into a release
so we might as well figure out if we're going to:
 1. have a tuning policy,
 2. continually revert the patch or
 3. actually get to the root cause of the problem (by working with board vendors I expect).
After investigating this patch, it just add a mechanism to avoid jitter thread take long time, but not "really" solve the jitter thread full cpu usage when it start.  But we still need to think about the timeout value because this commit will contain in next rng-tool release version.

There is another question need to clarify that whether rng-tool built-in jitter random generator still need after kernel 5.6, the /dev/random doesn't block anymore, perhaps we can turn built-in jitter random generator off by default service start parameter because it would take full CPU cores effort to generate jitter random when it start.


Br,

Xiangyu


../Randy

commit 6e14b8bd491b3f94f60f36748959e0ae7d7c1f95
Author: Dan Horák <dan@...>
Date:   Mon Jul 18 04:47:41 2022

    rngd_jitter: make timeout configurable

    The current default 5 sec timeout can be too short in some situations
    (eg. slow hardware). Introduce a command line option (jitter:timeout)
    to make the value configurable. Increasing the default timeout is not
    desirable, because it might slow down system startup when there are
    other entropy sources.

    Signed-off-by: Dan Horák <dan@...>

commit c29424f10a0dcbd18ac25607fa1c81c18a960e81 (origin/jitter-init)
Author: Neil Horman <nhorman@...>
Date:   Mon May 16 13:38:54 2022

    Adjust jitterentropy library to timeout/fail on long delay

    When running rngd -l its possible, on platforms that have low jitter
    entropy to block for long periods of time.  Adjust jitter to timeout on
    init after 5 seconds in the event it takes to long to gether jitter
    entropy

    Also while we're at it, I might have a build solution for the presence
    of internal timers.  When jitterentropy is built without internal
    timers, jent_notime_init is defined publically, but when it is built
    with timers, its declared as a static symbol, preenting resolution, so
    we can test to see if the function exists.  If it does we _don't_ have
    notime support. The logic is a bit backwards, but i think it works



Ross