buildroot/support/download/dl-wrapper
Yann E. MORIN f91e89b6e6 support/download: teach dl-wrapper to handle more than one hash file
Currently, we expect and only use hash files that lie within the package
directory, alongside the .mk file. Those hash files are thus bundled
with Buildroot.

This implies that only what's known to Buildroot can ever get into those
hash files. For packages where the version is fixed (or a static
choice), then we can carry hashes for those known versions.

However, we do have a few packages for which the version is a free-form
entry, where the user can provide a custom location and/or version. like
a custom VCS tree and revision, or a custom tarball URL. This means that
Buildroot has no way to be able to cary hashes for such custom versions.

This means that there is no integrity check that what was downloaded is
what was expected. For a sha1 in a git tree, this is a minor issue,
because the sha1 by itself is already a hash of the expected content.
But for custom tarballs URLs, or for a tag in a VCS, there is indeed no
integrity check.

Buildroot can't provide such hashes, but interested users may want to
provide those, and currently there is no (easy) way to do so.

So, we need our download helpers to be able to accept more than one hash
file to lookup for hashes.

Extend the dl-wrapper and the check-hash helpers thusly, and update the
legal-info accordingly.

Note that, to be able to pass more than one hash file, we also need to
re-order the arguments passed to support/download/check-hash, which also
impies some shuffling in the three places it is called:
  - 2 in dl-wrapper
  - 1 in the legal-info infra

That in turn also requires that the legal-license-file macro args get
re-ordered to have the hash file last; we take the opportunity to also
move the HOST/TARGET arg to be first, like in the other legal-info
macros.

Reported-by: "Martin Zeiser (mzeiser)" <mzeiser@cisco.com>
Signed-off-by: Yann E. MORIN <yann.morin.1998@free.fr>
Cc: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
2023-11-07 11:48:45 +01:00

234 lines
8.8 KiB
Bash
Executable File

#!/usr/bin/env bash
# This script is a wrapper to the other download backends.
# Its role is to ensure atomicity when saving downloaded files
# back to BR2_DL_DIR, and not clutter BR2_DL_DIR with partial,
# failed downloads.
# To avoid cluttering BR2_DL_DIR, we download to a trashable
# location, namely in $(BUILD_DIR).
# Then, we move the downloaded file to a temporary file in the
# same directory as the final output file.
# This allows us to finally atomically rename it to its final
# name.
# If anything goes wrong, we just remove all the temporaries
# created so far.
# We want to catch any unexpected failure, and exit immediately.
set -e
export BR_BACKEND_DL_GETOPTS=":hc:d:o:n:N:H:lru:qf:e"
main() {
local OPT OPTARG
local backend output large_file recurse quiet rc
local -a uris hfiles
# Parse our options; anything after '--' is for the backend
while getopts ":c:d:D:o:n:N:H:lrf:u:qp:" OPT; do
case "${OPT}" in
c) cset="${OPTARG}";;
d) dl_dir="${OPTARG}";;
D) old_dl_dir="${OPTARG}";;
o) output="${OPTARG}";;
n) raw_base_name="${OPTARG}";;
N) base_name="${OPTARG}";;
H) hfiles+=( "${OPTARG}" );;
l) large_file="-l";;
r) recurse="-r";;
f) filename="${OPTARG}";;
u) uris+=( "${OPTARG}" );;
p) post_process="${OPTARG}";;
q) quiet="-q";;
:) error "option '%s' expects a mandatory argument\n" "${OPTARG}";;
\?) error "unknown option '%s'\n" "${OPTARG}";;
esac
done
# Forget our options, and keep only those for the backend
shift $((OPTIND-1))
if [ -z "${output}" ]; then
error "no output specified, use -o\n"
fi
# Legacy handling: check if the file already exists in the global
# download directory. If it does, hard-link it. If it turns out it
# was an incorrect download, we'd still check it below anyway.
# If we can neither link nor copy, fallback to doing a download.
# NOTE! This is not atomic, is subject to TOCTTOU, but the whole
# dl-wrapper runs under an flock, so we're safe.
if [ ! -e "${output}" -a -e "${old_dl_dir}/${filename}" ]; then
ln "${old_dl_dir}/${filename}" "${output}" || \
cp "${old_dl_dir}/${filename}" "${output}" || \
true
fi
# If the output file already exists and:
# - there's no .hash file: do not download it again and exit promptly
# - matches all its hashes: do not download it again and exit promptly
# - fails at least one of its hashes: force a re-download
# - there's no hash (but a .hash file): consider it a hard error
if [ -e "${output}" ]; then
if support/download/check-hash ${quiet} "${output}" "${output##*/}" "${hfiles[@]}"; then
exit 0
elif [ ${?} -ne 2 ]; then
# Do not remove the file, otherwise it might get re-downloaded
# from a later location (i.e. primary -> upstream -> mirror).
# Do not print a message, check-hash already did.
exit 1
fi
rm -f "${output}"
warn "Re-downloading '%s'...\n" "${output##*/}"
fi
# Look through all the uris that we were given to download the package
# source
download_and_check=0
rc=1
for uri in "${uris[@]}"; do
backend_urlencode="${uri%%+*}"
backend="${backend_urlencode%|*}"
case "${backend}" in
git|svn|cvs|bzr|file|scp|hg|sftp) ;;
*) backend="wget" ;;
esac
uri=${uri#*+}
urlencode=${backend_urlencode#*|}
# urlencode must be "urlencode"
[ "${urlencode}" != "urlencode" ] && urlencode=""
# tmpd is a temporary directory in which backends may store
# intermediate by-products of the download.
# tmpf is the file in which the backends should put the downloaded
# content.
# tmpd is located in $(BUILD_DIR), so as not to clutter the (precious)
# $(BR2_DL_DIR)
# We let the backends create tmpf, so they are able to set whatever
# permission bits they want (although we're only really interested in
# the executable bit.)
tmpd="$(mktemp -d "${BUILD_DIR}/.${output##*/}.XXXXXX")"
tmpf="${tmpd}/output"
# Helpers expect to run in a directory that is *really* trashable, so
# they are free to create whatever files and/or sub-dirs they might need.
# Doing the 'cd' here rather than in all backends is easier.
cd "${tmpd}"
# If the backend fails, we can just remove the content of the temporary
# directory to remove all the cruft it may have left behind, and try
# the next URI until it succeeds. Once out of URI to try, we need to
# cleanup and exit.
if ! "${OLDPWD}/support/download/${backend}" \
$([ -n "${urlencode}" ] && printf %s '-e') \
-c "${cset}" \
-d "${dl_dir}" \
-n "${raw_base_name}" \
-N "${base_name}" \
-f "${filename}" \
-u "${uri}" \
-o "${tmpf}" \
${quiet} ${large_file} ${recurse} -- "${@}"
then
# cd back to keep path coherence
cd "${OLDPWD}"
rm -rf "${tmpd}"
continue
fi
if [ -n "${post_process}" ] ; then
if ! "${OLDPWD}/support/download/${post_process}-post-process" \
-o "${tmpf}" \
-n "${raw_base_name}"
then
# cd back to keep path coherence
cd "${OLDPWD}"
rm -rf "${tmpd}"
continue
fi
fi
# cd back to free the temp-dir, so we can remove it later
cd "${OLDPWD}"
# Check if the downloaded file is sane, and matches the stored hashes
# for that file
if support/download/check-hash ${quiet} "${tmpf}" "${output##*/}" "${hfiles[@]}"; then
rc=0
else
if [ ${?} -ne 3 ]; then
rm -rf "${tmpd}"
continue
fi
# the hash file exists and there was no hash to check the file
# against
rc=1
fi
download_and_check=1
break
done
# We tried every URI possible, none seems to work or to check against the
# available hash. *ABORT MISSION*
if [ "${download_and_check}" -eq 0 ]; then
rm -rf "${tmpd}"
exit 1
fi
# tmp_output is in the same directory as the final output, so we can
# later move it atomically.
tmp_output="$(mktemp "${output}.XXXXXX")"
# 'mktemp' creates files with 'go=-rwx', so the files are not accessible
# to users other than the one doing the download (and root, of course).
# This can be problematic when a shared BR2_DL_DIR is used by different
# users (e.g. on a build server), where all users may write to the shared
# location, since other users would not be allowed to read the files
# another user downloaded.
# So, we restore the 'go' access rights to a more sensible value, while
# still abiding by the current user's umask. We must do that before the
# final 'mv', so just do it now.
# Some backends (cp and scp) may create executable files, so we need to
# carry the executable bit if needed.
[ -x "${tmpf}" ] && new_mode=755 || new_mode=644
new_mode=$(printf "%04o" $((0${new_mode} & ~0$(umask))))
chmod ${new_mode} "${tmp_output}"
# We must *not* unlink tmp_output, otherwise there is a small window
# during which another download process may create the same tmp_output
# name (very, very unlikely; but not impossible.)
# Using 'cp' is not reliable, since 'cp' may unlink the destination file
# if it is unable to open it with O_WRONLY|O_TRUNC; see:
# http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cp.html
# Since the destination filesystem can be anything, it might not support
# O_TRUNC, so 'cp' would unlink it first.
# Use 'cat' and append-redirection '>>' to save to the final location,
# since that is the only way we can be 100% sure of the behaviour.
if ! cat "${tmpf}" >>"${tmp_output}"; then
rm -rf "${tmpd}" "${tmp_output}"
exit 1
fi
rm -rf "${tmpd}"
# tmp_output and output are on the same filesystem, so POSIX guarantees
# that 'mv' is atomic, because it then uses rename() that POSIX mandates
# to be atomic, see:
# http://pubs.opengroup.org/onlinepubs/9699919799/functions/rename.html
if ! mv -f "${tmp_output}" "${output}"; then
rm -f "${tmp_output}"
exit 1
fi
return ${rc}
}
trace() { local msg="${1}"; shift; printf "%s: ${msg}" "${my_name}" "${@}"; }
warn() { trace "${@}" >&2; }
errorN() { local ret="${1}"; shift; warn "${@}"; exit ${ret}; }
error() { errorN 1 "${@}"; }
my_name="${0##*/}"
main "${@}"