ci/bare-metal: Try rebooting chezas again if they get stuck during tftp.

Occasionally something goes wrong in the network and a group of chezas will produce streams of "R8152: Bulk read error 0xffffffbf" errors during the TFTP process, eventually timing out 60 minutes into the job. By the time we notice, the next jobs seem to go through fine, so watch for these errors and try rebooting the cheza to see if that gets our jobs to pass again.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>
2024-12-05 08:14:57 +08:00 · 2020-08-19 11:41:51 -07:00 · 2020-08-19 11:41:51 -07:00 · 2da1178bf3
commit 2da1178bf3
parent c27075e9e1
1 changed files with 10 additions and 0 deletions
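
The counting logic this commit introduces can be sketched in isolation, which makes the threshold behaviour easier to check without a serial console. This is only a minimal sketch: the function name `scan_for_tftp_hang`, the constants, and the `None` return for "no hang seen" are hypothetical and not part of `cros_servo_run.py`. The error string, the threshold of 100, and the return value of 2 are taken from the diff; that the caller treats 2 as "reboot and retry" is an assumption based on the commit message.

```python
import re

# Hypothetical names for illustration; the real script inlines this logic
# in its serial-output loop.
TFTP_ERROR = re.compile(r"R8152: Bulk read error 0xffffffbf")
MAX_TFTP_FAILURES = 100  # threshold used in the diff


def scan_for_tftp_hang(lines, max_failures=MAX_TFTP_FAILURES):
    """Return 2 once max_failures TFTP error lines are seen, else None.

    The value 2 mirrors the diff, which returns 2 for this case and for the
    POWER_GOOD timeout (assumed here to mean "reboot and retry").
    """
    tftp_failures = 0
    for line in lines:
        # re.match anchors at the start of the line, as in the diff.
        if TFTP_ERROR.match(line):
            tftp_failures += 1
            if tftp_failures >= max_failures:
                return 2
    return None
```

Counting to a threshold rather than reacting to the first error leaves room for an isolated bulk read error that the firmware recovers from by itself.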
--- a/.gitlab-ci/bare-metal/cros_servo_run.py
+++ b/.gitlab-ci/bare-metal/cros_servo_run.py
@@ -52,6 +52,7 @@ class CrosServoRun:
                self.cpu_write("\016")
                break

+        tftp_failures = 0
        for line in self.cpu_ser.lines():
            if re.match("---. end Kernel panic", line):
                return 1
@@ -62,6 +63,15 @@ class CrosServoRun:
            if re.match("POWER_GOOD not seen in time", line):
                return 2

+            # The Cheza firmware seems to occasionally get stuck looping in
+            # this error state during TFTP booting, possibly based on amount of
+            # network traffic around it, but it'll usually recover after a
+            # reboot.
+            if re.match("R8152: Bulk read error 0xffffffbf", line):
+                tftp_failures += 1
+                if tftp_failures >= 100:
+                    return 2
+
            result = re.match("bare-metal result: (\S*)", line)
            if result:
                if result.group(1) == "pass":