Contrary to what the original commit said, this is actually still used
(see .gitlab-ci/bare-metal/poe-powered.sh:205), and the boot retry logic
has been broken ever since, exacerbating the rpi farm boot problems.
Fixes: 97b2afa16a ("ci/bare-metal: Drop the 2 vs 1 exit code from poe_run.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30340>
To power a DuT on or off through SNMP on a PoE switch, some vendors do
not use the interface number directly as the SNMP key, but the interface
number shifted by a base number.
Define this base number as the BM_POE_BASE environment variable in the runner.
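A minimal sketch of the shift described above, written in Python purely for illustration (the real script is shell). The function name `snmp_port_index` and the default of 0 when the variable is unset are assumptions; only `BM_POE_BASE` comes from this change.

```python
import os


def snmp_port_index(interface: int, env=os.environ) -> int:
    """Return the SNMP key index for a PoE switch port.

    Assumption: when BM_POE_BASE is unset the base is 0, so the key is
    just the interface number.
    """
    base = int(env.get("BM_POE_BASE", "0"))
    return base + interface
```

For example, with `BM_POE_BASE=1000` interface 5 would map to key index 1005.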
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29306>
Use the CustomLogger class and CLI tool to create structured logs
for the poe scripts, which are used by the broadcom and nouveau jobs.
Rename the lint stage to code-validation and add a python-test job
to CI, which runs the tests for the structured and custom loggers.
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25179>
In some cases we can have the config.txt boot file already available in
the tftp folder.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26552>
We need to update the kernel often, and we need to test kernel changes often.
Introduce `KERNEL_EXTERNAL_TAG` to distinguish it from `KERNEL_TAG`, which is
also used to rebuild the containers. The external kernel does not require a
container rebuild, so this way we avoid one.
Updating kernel goes wruuuuuum.
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23563>
We have job-level retry on failure now, and will continue to need it in
order to work around fd.o infrastructure flakes. If we stop doing retry
inside the job, then we can crank down the gitlab-level timeouts on test
jobs to be closer to our CI guidelines and avoid blocking a runner for an
hour when things go wrong (for example, cheza #16 failing to boot in a
recognized way and continuously looping due to the intra-job retry).
Plus, the job logs will be more readable when you don't have two boots in
one job, and we'll get the flakes surfaced in our monitoring dashboards.
If internal retries were really doing useful work we may see an increase
in flakes as a result of this. I'm committing to turning off boards or
reducing coverage as necessary to handle this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25790>
1. Move the block used for detecting R8152 problems to the bootloader
phase, where it belongs. Also drop the requirement of 100 failures and just
retry immediately.
2. Consider the job failed after 10 errors, not 100. From the logs on
cheza-14, ~30 errors is enough to fail.
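The threshold in point 2 can be sketched as follows. This is an illustration only: the function name, the `marker` argument, and the choice to trip on the 10th occurrence (rather than the 11th) are assumptions, not the script's actual logic.

```python
from typing import Iterable

MAX_ERRORS = 10  # threshold named in this change (previously 100)


def too_many_errors(lines: Iterable[str], marker: str,
                    limit: int = MAX_ERRORS) -> bool:
    """Return True as soon as `marker` has been seen `limit` times."""
    seen = 0
    for line in lines:
        if marker in line:
            seen += 1
            if seen >= limit:
                return True
    return False
```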
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25285>
We don't need the path at all when we use an external kernel.
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24646>
curl is already installed in these images, drop it.
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24646>
10 is overkill, if we fail that many times in a row we should stop
trying on this runner.
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24407>
Modify the image build process to include the ci-kdl build, so that ci-kdl
is available in the Mesa jobs. Also modify init-stage2 to launch, in the
background, the process that collects data and stores a JSON file with the
relative changes in the recorded data.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24177>
We're only going to read it, not execute it.
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22945>
Better error handling is more reliable.
Options:
-L, follow location
--retry, number of retries
--retry-all-errors, retry on any error; HTTP error responses only count as errors with -f, which is why -f is also passed
-f, fail fast with no output at all on server errors
--retry-delay, make curl sleep this amount of time before each retry
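A small sketch of how the options listed above combine into one command line, expressed in Python as an argument list. The helper name `curl_argv`, the retry count, the delay, and the use of `-o` for the output path are illustrative assumptions, not the values used in the scripts.

```python
def curl_argv(url: str, output: str, retries: int = 4,
              retry_delay: int = 60) -> list:
    """Build a curl command line using the options from this change."""
    return [
        "curl",
        "-L",                               # follow redirects
        "--retry", str(retries),            # number of retries
        "--retry-all-errors",               # retry on any error
        "-f",                               # treat HTTP server errors as failures
        "--retry-delay", str(retry_delay),  # seconds to sleep before each retry
        "-o", output,                       # assumption: write to a file
        url,
    ]
```

The list can be passed to `subprocess.run(..., check=True)` so that a final curl failure raises an exception.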
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20788>
Now that we assign reset blame appropriately, they're safe to run
together, and no single-threading. I put these in a .toml because I'm
about to add another window system.
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19912>
Make it possible to provide per device mkimage.py params.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13691>
This is a farm of 5 (6, but one fails) TK1 boards for nouveau testing,
hosted and maintained by me. Currently it runs GLES dEQP.
I've been using ./.gitlab-ci/bin/ci_run_n_monitor.py --stress --target
gk20a to test it and am pretty confident of the skips/flakes list. Last
night it ran 318 jobs without fail, and prior to that there were two sets
of runs in the 100-200 range where only the one failing runner failed any
jobs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18497>
If we got a "Reached the end of the CPU serial log without finding a
result" because the test phase timed out, then the CPU serial would have
been closed as part of the timeout process, so we need to close the rest
and re-instantiate the servo run class.
fastboot and poe already re-instantiate the class on retry.
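A minimal sketch of the pattern, assuming a run object with `run()` and `close()` methods. The status constants, `run_with_retries`, and the attempt limit are hypothetical names for illustration, not the actual servo code.

```python
PASS = 0
FAIL = 1
RETRY = 2  # hypothetical "please retry" status


def run_with_retries(make_run, max_attempts=3):
    """Build a fresh run object per attempt and always close the old one."""
    for _ in range(max_attempts):
        run = make_run()  # re-instantiate: new serial connections each time
        try:
            status = run.run()
        finally:
            run.close()   # close whatever the timeout path left open
        if status != RETRY:
            return status
    return FAIL
```

Creating a new object per attempt avoids reusing a serial handle that the timeout path has already closed.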
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17689>
It seems that we sometimes stall out executing "fastboot boot", and if
that happens we want to reboot the board and try again.
Fixes: #6682
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17607>
This should help with "marge got stuck for an hour and all I got was this
failed job with no results/" when a system intermittently wedges.
This replaces the BM_POE_TIMEOUT ("did we get something on serial in the
last 3 minutes?") that rpi had, in favor of checking that the whole test
job gets through in 20 minutes.
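The difference from a per-line inactivity timeout can be sketched as one deadline for the whole job. `JobDeadline` and its methods are hypothetical names for illustration; the 20-minute figure is the one quoted above.

```python
import time


class JobDeadline:
    """Track a single deadline covering the whole test job."""

    def __init__(self, minutes: float, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + minutes * 60

    def remaining(self) -> float:
        """Seconds left before the job is considered timed out."""
        return max(0.0, self._deadline - self._clock())

    def expired(self) -> bool:
        return self._clock() >= self._deadline
```

A serial read loop could then pass `deadline.remaining()` as its read timeout, so a slow but chatty board still gets cut off at 20 minutes.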
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
My local trogdor has a netboot firmware and I want to be able to use it to
test the timeout code I'm working on.
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
If the SerialBuffers can just feed the same line queue, then we don't need
the extra threads reading line queues into a new merged line queue.
Less python threading code is always better. Plus, now we can pass args
to SerialBuffer.lines() for timeout/phase.
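A simplified sketch of the idea, not the real SerialBuffer class: each reader pushes straight onto one shared queue, so no merging thread is needed. `LineFeeder`, the prefix argument, and the module-level `lines()` helper are illustrative assumptions.

```python
import queue
import threading


class LineFeeder:
    """Read lines from one stream and push them onto a shared queue."""

    def __init__(self, stream, prefix, line_queue):
        self._stream = stream
        self._prefix = prefix
        self._queue = line_queue
        self.thread = threading.Thread(target=self._pump, daemon=True)
        self.thread.start()

    def _pump(self):
        for raw in self._stream:
            self._queue.put(self._prefix + raw.rstrip("\n"))


def lines(line_queue, timeout):
    """Yield queued lines until none arrives within `timeout` seconds."""
    while True:
        try:
            yield line_queue.get(timeout=timeout)
        except queue.Empty:
            return
```

Because the consumer owns the only `get()` call, the timeout can be chosen per call, which is what makes a per-phase timeout argument possible.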
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
This should avoid the 1-hour timeouts if something goes wrong, and just
restart.
Fixes: #6682
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
My editor likes to pep8 as I edit, and I'm tired of carefully not
committing those changes.
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
The script will be used for tuning Intel GPU frequency to maximize
performance tests execution, while also trying to reduce throttling,
which has a negative impact on results consistency.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15662>
The test suite is full of flakes around transform feedback, atomics, and
tess. But, I hope it can be useful for regression testing core Mesa
reworks.
This required updating the kernel to 5.16.12 to get a more stable boot
process. That kernel rebuild caused an update of the container with
piglit, which was missed in a previous MR, so we got new xfails in x86
swrast.
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> (nouveau)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
This required updating the kernel to 5.16.12 to get a more stable boot
process. That kernel rebuild caused an update of the container with
piglit, which was missed in a previous MR, so we got new xfails in x86
swrast. Also, including modules on arm64 exposed a bug in v3d's
poe-powered.sh rsyncing of modules.
Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
The manual jetson CI job I'm introducing has serious boot reliability
trouble, but also we've seen frequent intermittent failures on bcm where
at least 2 boots don't seem to be enough (#6041).
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
But don't bail immediately, instead print out some more lines after the
hang, hopefully catching info about the cause of the hang.
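One way to sketch this behaviour: keep yielding a bounded number of lines after the hang is first seen, then stop. The generator name, the `hang_marker` argument, and the default of 10 extra lines are assumptions for illustration.

```python
def lines_until_hang(lines, hang_marker, extra_lines=10):
    """Yield lines, stopping `extra_lines` lines after the first hang line."""
    remaining = None  # None until the hang marker has been seen
    for line in lines:
        yield line
        if remaining is None:
            if hang_marker in line:
                remaining = extra_lines
        else:
            remaining -= 1
        if remaining == 0:
            return
```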
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14033>
So instead cancel the read first, and then close. Make sure the
serial-reading properly detects this cancelled condition under all
circumstances and exits.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14033>
For every CI job, put JWT content into a file and unset CI_JOB_JWT
environment var
=======
* virgl jobs:
  - Share JWT token file to crosvm instance
  - Keep using `export -p` due to high complexity in the scripts
    of these jobs. At least, the CI_JOB_JWT will not be leaked,
    since it is being unset at the `before_script` phase of each
    Mesa CI job.
* iris jobs: Update lava_job_submitter to take token file as argument
  - generate-env with CI_JOB_JWT_TOKEN_FILE
  - create token file during baremetal init stage
* baremetal jobs: Copy token file to bare-metal NFS
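The core step (move the token into a file, then drop it from the environment) can be sketched as below. This is an illustration in Python rather than the shell used by the jobs; `stash_job_jwt` and the 0600 file mode are assumptions, while `CI_JOB_JWT` and `CI_JOB_JWT_TOKEN_FILE` are the names used in this change.

```python
import os


def stash_job_jwt(env, token_path):
    """Write CI_JOB_JWT to `token_path` and remove it from `env`.

    Returns the token path, or None when no token was present.
    """
    token = env.pop("CI_JOB_JWT", None)
    if token is None:
        return None
    # Assumption: restrict the file to the current user.
    fd = os.open(token_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as token_file:
        token_file.write(token)
    env["CI_JOB_JWT_TOKEN_FILE"] = token_path
    return token_path
```

After this step, anything that dumps the environment (such as `export -p`) only exposes the path, not the token itself.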
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14004>
We can use the general "how parallel should we go on this runner?" env var
and save a bunch of massaging env var names. Fixes how PIGLIT_PARALLEL
looked like it was useful but actually wasn't passed through to HW
runners.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13372>