Y2038 proposal


Alexander Kanavin
 

On Tue, 29 Nov 2022 at 16:45, Stephen Jolley <sjolley.yp.pm@...> wrote:
> We’d welcome a proposal/series on how to move forward with the Y2038 work for 32 bit platforms.

I have the following proposal:

1. A branch is made where:
a. "-D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64" is enabled globally.
b. qemu is always started with "-rtc base=2040-01-01", simulating
Y2038 actually occurring.
c. an additional runtime test verifies that both RTC clock and system
clock report 2040.

2. This branch is run through a-full on the autobuilder. Any uncovered
issues are filed as bugs.

3. Once *all* of the bugs are addressed, repeat point 2.

4. Once there are no more open bugs, 1a is merged into master.

Any fatal flaws in the plan?
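
A minimal sketch of what the check in 1c could look like, assuming the test can capture the output of `date +%s` (system clock) and of `cat /sys/class/rtc/rtc0/since_epoch` (RTC) from the target. The helper names are hypothetical and are not an existing oeqa API; how the commands get run on the target is left out.

```python
from datetime import datetime, timezone

def clock_years(system_epoch, rtc_epoch):
    """Convert two epoch-second strings into (system_year, rtc_year) in UTC.

    Hypothetical helper: the inputs are assumed to be the raw output of
    `date +%s` and of `cat /sys/class/rtc/rtc0/since_epoch` on the target.
    """
    def to_year(text):
        return datetime.fromtimestamp(int(text.strip()), tz=timezone.utc).year
    return to_year(system_epoch), to_year(rtc_epoch)

def clocks_report_year(system_epoch, rtc_epoch, expected_year=2040):
    """True only if both the system clock and the RTC report expected_year."""
    return all(year == expected_year
               for year in clock_years(system_epoch, rtc_epoch))
```

Comparing epoch seconds rather than formatted dates avoids depending on the output format of a particular `date` or `hwclock` implementation.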

It's not hard to see that the Y2038 problem is real and serious, e.g. on
qemux86 with core-image-full-cmdline built from master:

root@qemux86:~# ls /
bin boot dev etc home lib lost+found media mnt proc
run sbin sys tmp usr var
root@qemux86:~# date -s "2040-01-01"
Sun Jan 1 00:00:00 UTC 2040
root@qemux86:~# ls /
bin boot dev etc home lib lost+found media mnt proc
run sbin sys tmp usr var
root@qemux86:~# ls /
-sh: ls: command not found

On qemux86_64 the same sequence works as expected, of course.
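
The arithmetic behind the 2038 limit can be sketched as follows. This only shows why 2040-01-01 does not fit into a signed 32-bit time_t; it is an illustration, not a diagnosis of why `ls` in particular disappears in the session above.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit time_t can hold

def wrap_int32(value):
    """Reduce an integer to the signed 32-bit range, as a truncating store would."""
    return (value + 2**31) % 2**32 - 2**31

# Last moment representable with a signed 32-bit time_t.
last_32bit_moment = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)

# Epoch seconds for the date used in the session above.
target = int(datetime(2040, 1, 1, tzinfo=timezone.utc).timestamp())

print(last_32bit_moment.isoformat())  # 2038-01-19T03:14:07+00:00
print(target > INT32_MAX)             # True
print(wrap_int32(target))             # -2085978496, i.e. before 1970
```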

Alex


Richard Purdie
 

On Wed, 2022-11-30 at 09:07 +0100, Alexander Kanavin wrote:
> Any fatal flaws in the plan?

Others have made some good comments. My thoughts:

* We need to add some runtime tests to oeqa for this (in addition to
the ptests).

* We need to have a 32-bit ptest run on the autobuilder (qemux86 should
work; I'm not sure we can make qemuarm fast). Whether this is manually
triggered, I'm not sure. We could have a smaller set of ptests to run
for it?

* Could we optionally disable some of the glibc 32-bit function calls
to ensure they're not being used? We don't really want to diverge from
upstream glibc much though.

* We need to work out how to communicate that this change has happened
and have people "buy in" to it. The reason is that if someone has
existing binaries, there could be problems using them after the change.
We therefore need to be sure they are aware of it.
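
On the glibc point above: one alternative that avoids diverging from upstream is to leave glibc alone and scan built binaries for imports of the legacy entry points instead. Below is a minimal sketch of that different technique, assuming the imported symbol names have already been extracted (for example from `readelf --dyn-syms` output) and assuming that binaries built with -D_TIME_BITS=64 on 32-bit glibc import `*64` variants such as `__clock_gettime64` rather than the plain names. The symbol set and the function name are illustrative, not an existing check.

```python
# Illustrative subset only (an assumption, not a complete list): a real check
# would need every time_t-dependent entry point that glibc exports.
LEGACY_TIME_SYMBOLS = {
    "time", "clock_gettime", "gettimeofday", "mktime", "localtime",
}

def legacy_time_symbols(imported_symbols):
    """Return the imported symbols that match the legacy 32-bit time_t names."""
    # Strip symbol version suffixes such as "@GLIBC_2.0" before comparing.
    names = {sym.split("@", 1)[0] for sym in imported_symbols}
    return sorted(names & LEGACY_TIME_SYMBOLS)
```

For example, `legacy_time_symbols(["clock_gettime@GLIBC_2.17", "__time64@GLIBC_2.34"])` would flag only `clock_gettime`.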

Cheers,

Richard


Khem Raj
 

On Wed, Nov 30, 2022 at 12:08 AM Alexander Kanavin
<alex.kanavin@...> wrote:

> On Tue, 29 Nov 2022 at 16:45, Stephen Jolley <sjolley.yp.pm@...> wrote:
> > We’d welcome a proposal/series on how to move forward with the Y2038 work for 32 bit platforms.
> I have the following proposal:
>
> 1. A branch is made where:
> a. "-D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64" is enabled globally.

I have something like this on the yoe/mut branch on the contrib repo (due to
musl removing the LFS hacks).
However, there are packages which need to be fixed at build time.

> b. qemu is always started with "-rtc base=2040-01-01", simulating
> Y2038 actually occurring.

This is a good time machine :)

> c. an additional runtime test verifies that both RTC clock and system
> clock report 2040.
>
> 2. This branch is run through a-full on the autobuilder. Any uncovered
> issues are filed as bugs.
>
> 3. Once *all* of the bugs are addressed, repeat point 2.
>
> 4. Once there are no more open bugs, 1a is merged into master.
>
> Any fatal flaws in the plan?

Not many issues, except that package fixes may need to be carried
locally for a while.




Alexander Kanavin
 

On Wed, 30 Nov 2022 at 14:15, Richard Purdie
<richard.purdie@...> wrote:
> * We need to have a 32 bit ptest run on the autobuilder (qemux86 should
> work, not sure we can make qemuarm fast). Whether this is manually
> triggered, not sure. We could have a smaller set of ptests to run for
> it?

I just ran the full qemux86 ptest locally. It took 4h10m (the same as the
qemuarm64 ptest on an arm worker). The failures were:

{'python3': ['test_deterministic_sets'],
'valgrind': ['gdbserver_tests/hgtls',
'gdbserver_tests/mcblocklistsearch',
'gdbserver_tests/mcbreak',
'gdbserver_tests/mcclean_after_fork',
'gdbserver_tests/mchelp',
'gdbserver_tests/mcinfcallRU',
'gdbserver_tests/mcinfcallWSRU',
'gdbserver_tests/mcinvokeRU',
'gdbserver_tests/mcinvokeWS',
'gdbserver_tests/mcleak',
'gdbserver_tests/mcmain_pic',
'gdbserver_tests/mcsignopass',
'gdbserver_tests/mcsigpass',
'gdbserver_tests/mcvabits',
'gdbserver_tests/mcwatchpoints',
'gdbserver_tests/mssnapshot',
'gdbserver_tests/nlcontrolc',
'gdbserver_tests/nlgone_abrt',
'gdbserver_tests/nlgone_exit',
'gdbserver_tests/nlgone_return',
'gdbserver_tests/nlpasssigalrm',
'gdbserver_tests/nlsigvgdb',
'gdbserver_tests/nlvgdbsigqueue',
'memcheck/tests/linux/memfd_create',
'memcheck/tests/linux/timerfd-syscall',
'memcheck/tests/origin5-bz2',
'massif/tests/mmapunmap']}

So I think we might as well fix these and add the full qemux86 ptest to
a-full? It is not heavy on the builder machine (it mostly just runs a
single qemu thread); it's just long.

Alex


Richard Purdie
 

On Thu, 2022-12-01 at 11:27 +0100, Alexander Kanavin wrote:
> So I think we could as well fix these, and add full qemux86 ptest to
> a-full? It is not heavy on the builder machine (mostly just runs a
> single qemu thread), it's just long.

I think we should fix those and we should add the target to the
autobuilder, but I'm reluctant to add a long test to a-full. The fact that
it is relatively clean suggests it doesn't regress that often. We could do
something like a once-a-month trigger for it?

Cheers,

Richard