mirror of
https://github.com/fosslinux/live-bootstrap.git
synced 2026-03-23 11:36:32 +01:00
refactor+docs(payload.img, payload.img discovery): split offline distfiles at improve: import_payload so main image is minimal and payload.img is primary carrier, detect payload.img automatically using magic number
parent
e20afe69bb
commit
beb9fb12f9
5 changed files with 234 additions and 26 deletions
102
Payload_img_design.md
Normal file
@ -0,0 +1,102 @
# live-bootstrap

This repository uses [`README.rst`](./README.rst) as the canonical main documentation.

## Kernel-bootstrap `payload.img`

`payload.img` is a raw container disk used in kernel-bootstrap offline mode
(`--repo` and `--external-sources` are both unset).

### Why not put everything in the initial image?

In kernel-bootstrap mode, the first boot image is consumed by very early
runtime code before the system reaches the normal bash-based build stage.
That early stage has tight assumptions about memory layout and file table usage.

When too many distfiles are packed into the initial image, those assumptions can
be violated, which leads to unstable handoff behavior (for example, failures
around the Fiwix transition in QEMU or on bare metal).

So the design is intentionally split:

- Initial image: only what is required to reach `improve: import_payload`
- `payload.img`: the rest of the offline distfiles

This is not a patch-style workaround. It is a two-phase transport design that
keeps early boot deterministic and moves bulk data import to a stage where the
runtime is robust enough to process it safely.

### Why import from an external image and copy into the main filesystem?

Because the bootstrap still expects distfiles to end up under the normal local
path (`/external/distfiles`) for later steps. `payload.img` is used as a
transport medium only.

The flow is:

1. Boot the minimal initial image.
2. Reach `improve: import_payload`.
3. Detect the payload disk by magic (`LBPAYLD1`) across detected block devices.
4. Copy payload files into `/external/distfiles`.
5. Continue the build exactly as if files had been present locally all along.
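
Step 3 of the flow can be illustrated with a minimal Python sketch. It is not the real importer: in this commit the detection is done by the `payload-import --probe` tool, driven by a shell loop over `/proc/partitions`. The function names `has_payload_magic` and `find_payload`, and the temporary file names in the demo, are hypothetical and exist only for this example.

```python
import os
import tempfile

MAGIC = b"LBPAYLD1"  # 8-byte header magic from the format description


def has_payload_magic(path):
    """Return True if the first 8 bytes of `path` equal the payload magic."""
    try:
        with open(path, "rb") as dev:
            return dev.read(len(MAGIC)) == MAGIC
    except OSError:
        # Missing or unreadable candidates are skipped, not treated as fatal.
        return False


def find_payload(candidates):
    """Return the first candidate that carries the magic, or None."""
    for path in candidates:
        if has_payload_magic(path):
            return path
    return None


if __name__ == "__main__":
    # Demo with ordinary files standing in for block devices (an assumption).
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "disk0.img")
        payload = os.path.join(tmp, "disk1.img")
        with open(plain, "wb") as f:
            f.write(b"\x00" * 512)
        with open(payload, "wb") as f:
            f.write(MAGIC + b"\x00" * 4)  # magic plus a zero file count
        print(find_payload([plain, payload]) == payload)
```

Selecting by content rather than by device name is what lets the same image work whether the kernel exposes the disk as `/dev/sdb`, `/dev/hdb`, or something else.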

### Format

- Magic: `LBPAYLD1` (8 bytes)
- Then: little-endian `u32` file count
- Repeated entries:
  - little-endian `u32` name length
  - little-endian `u32` file size
  - file name bytes (no terminator)
  - file bytes

The importer probes detected block devices and selects the one with magic `LBPAYLD1`.
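
The layout above can be sketched as a small in-memory encoder and decoder. This is only an illustration of the byte layout, not the code used by `rootfs.py` or `payload-import`. The names `pack_payload` and `unpack_payload` are hypothetical, and encoding file names as UTF-8 is an assumption (the format only specifies raw name bytes).

```python
import io
import struct

MAGIC = b"LBPAYLD1"  # 8-byte header magic


def pack_payload(files):
    """Serialize a {name: bytes} mapping into the container layout."""
    out = io.BytesIO()
    out.write(MAGIC)
    out.write(struct.pack("<I", len(files)))  # little-endian u32 file count
    for name, data in files.items():
        raw_name = name.encode("utf-8")  # assumption: names are UTF-8
        # little-endian u32 name length, then little-endian u32 file size
        out.write(struct.pack("<II", len(raw_name), len(data)))
        out.write(raw_name)  # no terminator
        out.write(data)
    return out.getvalue()


def unpack_payload(blob):
    """Parse a container produced by pack_payload back into {name: bytes}."""
    stream = io.BytesIO(blob)
    if stream.read(len(MAGIC)) != MAGIC:
        raise ValueError("missing LBPAYLD1 magic")
    (count,) = struct.unpack("<I", stream.read(4))
    files = {}
    for _ in range(count):
        name_len, size = struct.unpack("<II", stream.read(8))
        name = stream.read(name_len).decode("utf-8")
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("truncated payload entry: " + name)
        files[name] = data
    return files


if __name__ == "__main__":
    blob = pack_payload({"a.tar": b"xyz"})
    print(unpack_payload(blob) == {"a.tar": b"xyz"})
```

Because both length fields are `u32`, a single entry cannot describe a file of 4 GiB or more.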

### Manual creation without Python

Prepare `payload.list` as:

```text
<archive-name> <absolute-path-to-archive>
```

Then:

```sh
cat > make-payload.sh <<'SH'
#!/bin/sh
set -e
out="${1:-payload.img}"
list="${2:-payload.list}"

write_u32le() {
  v="$1"
  printf '%08x' "$v" | sed -E 's/(..)(..)(..)(..)/\4\3\2\1/' | xxd -r -p
}

# Count only the entries the loop below writes (blank lines are skipped).
count="$(while read -r name rest; do [ -z "${name}" ] || echo; done < "${list}" | wc -l | tr -d ' ')"
: > "${out}"
printf 'LBPAYLD1' >> "${out}"
write_u32le "${count}" >> "${out}"

while read -r name path; do
  [ -n "${name}" ] || continue
  size="$(wc -c < "${path}" | tr -d ' ')"
  write_u32le "${#name}" >> "${out}"
  write_u32le "${size}" >> "${out}"
  printf '%s' "${name}" >> "${out}"
  cat "${path}" >> "${out}"
done < "${list}"
SH
chmod +x make-payload.sh
./make-payload.sh payload.img payload.list
```

Attach `payload.img` as an extra raw disk in QEMU, or as the second disk on bare metal.

### When it is used

- Used in kernel-bootstrap offline mode.
- Not used when `--repo` or `--external-sources` is provided.
- `--build-guix-also` increases payload contents (includes post-early `steps-guix`
  sources), but does not change the mechanism.
73
README.rst
|
|
@ -63,17 +63,78 @@ Without using Python:
|
|||
|
||||
* *Only* copy distfiles listed in ``sources`` files for ``build:`` steps
|
||||
manifested before ``improve: get_network`` into this disk.
|
||||
* Optionally (if you don't do this, distfiles will be network downloaded):
|
||||
* In kernel-bootstrap offline mode (no ``--repo`` and no
|
||||
``--external-sources``), use the second image as ``payload.img``.
|
||||
``payload.img`` is a raw container (not a filesystem) used to carry the
|
||||
distfiles that are not needed before ``improve: import_payload``.
|
||||
In other words, the first image only carries the minimal set needed to
|
||||
reach the importer; the rest of the offline distfiles live in payload.
|
||||
|
||||
* On the second image, create an MSDOS partition table and one ext3
|
||||
partition.
|
||||
* Copy ``distfiles/`` into this disk.
|
||||
* Run QEMU, with 4+G RAM, optionally SMP (multicore), both drives (in the
|
||||
order introduced above), a NIC with model E1000
|
||||
* Header magic: ``LBPAYLD1`` (8 bytes).
|
||||
* Then: little-endian ``u32`` file count.
|
||||
* Repeated for each file: little-endian ``u32`` name length,
|
||||
little-endian ``u32`` file size, raw file name bytes, raw file bytes.
|
||||
|
||||
* If you are not in that mode, the second disk can still be used as an
|
||||
optional ext3 distfiles disk, as before.
|
||||
* Run QEMU, with 4+G RAM, optionally SMP (multicore), both drives (main
|
||||
builder image plus payload/ext3 image), a NIC with model E1000
|
||||
(``-nic user,model=e1000``), and ``-machine kernel-irqchip=split``.
|
||||
c. **Bare metal:** Follow the same steps as QEMU, but the disks need to be
|
||||
two different *physical* disks, and boot from the first disk.
|
||||
|
||||
Manual ``payload.img`` preparation
|
||||
----------------------------------
|
||||
|
||||
The following script creates a raw ``payload.img`` from a manually prepared
|
||||
file list. This is equivalent to what ``rootfs.py`` does for kernel-bootstrap
|
||||
offline mode.
|
||||
|
||||
1. Prepare a ``payload.list`` with one file per line, formatted as:
|
||||
``<archive-name> <absolute-path-to-archive>``.
|
||||
2. Run:
|
||||
|
||||
::
|
||||
|
||||
cat > make-payload.sh <<'EOF'
|
||||
#!/bin/sh
|
||||
set -e
|
||||
out="${1:-payload.img}"
|
||||
list="${2:-payload.list}"
|
||||
|
||||
write_u32le() {
|
||||
v="$1"
|
||||
printf '%08x' "$v" | sed -E 's/(..)(..)(..)(..)/\4\3\2\1/' | xxd -r -p
|
||||
}
|
||||
|
||||
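# Count only the entries the loop below writes (blank lines are skipped).
count="$(while read -r name rest; do [ -z "${name}" ] || echo; done < "${list}" | wc -l | tr -d ' ')"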
|
||||
: > "${out}"
|
||||
printf 'LBPAYLD1' >> "${out}"
|
||||
write_u32le "${count}" >> "${out}"
|
||||
|
||||
while read -r name path; do
|
||||
[ -n "${name}" ] || continue
|
||||
size="$(wc -c < "${path}" | tr -d ' ')"
|
||||
write_u32le "${#name}" >> "${out}"
|
||||
write_u32le "${size}" >> "${out}"
|
||||
printf '%s' "${name}" >> "${out}"
|
||||
cat "${path}" >> "${out}"
|
||||
done < "${list}"
|
||||
EOF
|
||||
chmod +x make-payload.sh
|
||||
./make-payload.sh payload.img payload.list
|
||||
|
||||
3. Attach ``payload.img`` as an additional raw disk when booting in QEMU, or
|
||||
as the second physical disk on bare metal.
|
||||
|
||||
Notes:
|
||||
|
||||
* ``payload.img`` is used in kernel-bootstrap offline mode regardless of
|
||||
``--build-guix-also``. With ``--build-guix-also``, the payload content is
|
||||
larger because it also includes post-early sources from ``steps-guix``.
|
||||
* The runtime importer identifies the correct disk by checking the magic
|
||||
``LBPAYLD1`` on each detected block device, not by assuming a device name.
|
||||
|
||||
Mirrors
|
||||
-------
|
||||
|
||||
|
|
|
|||
|
|
@ -37,10 +37,14 @@ class Generator():
|
|||
self.repo_path = repo_path
|
||||
self.mirrors = mirrors
|
||||
self.build_guix_also = build_guix_also
|
||||
self.source_manifest = self.get_source_manifest(not self.external_sources,
|
||||
build_guix_also=self.build_guix_also)
|
||||
self.early_source_manifest = self.get_source_manifest(True,
|
||||
build_guix_also=self.build_guix_also)
|
||||
self.source_manifest = self.get_source_manifest(
|
||||
stop_before_improve=("get_network" if not self.external_sources else None),
|
||||
build_guix_also=self.build_guix_also
|
||||
)
|
||||
self.early_source_manifest = self.get_source_manifest(
|
||||
stop_before_improve="get_network",
|
||||
build_guix_also=self.build_guix_also
|
||||
)
|
||||
self.bootstrap_source_manifest = self.source_manifest
|
||||
self.payload_source_manifest = []
|
||||
self.payload_image = None
|
||||
|
|
@ -59,11 +63,17 @@ class Generator():
|
|||
"""
|
||||
Split early source payload from full offline payload.
|
||||
"""
|
||||
# Keep the early builder payload small enough to avoid overrunning
|
||||
# builder-hex0 memory file allocation before we can jump into Fiwix.
|
||||
self.bootstrap_source_manifest = self.get_source_manifest(True, build_guix_also=False)
|
||||
# Keep the early builder payload small: include only sources needed
|
||||
# before improve: import_payload runs, so payload.img is the primary
|
||||
# carrier for the rest of the offline distfiles.
|
||||
self.bootstrap_source_manifest = self.get_source_manifest(
|
||||
stop_before_improve="import_payload",
|
||||
build_guix_also=False
|
||||
)
|
||||
|
||||
full_manifest = self.get_source_manifest(False, build_guix_also=self.build_guix_also)
|
||||
full_manifest = self.get_source_manifest(build_guix_also=self.build_guix_also)
|
||||
if self.bootstrap_source_manifest == full_manifest:
|
||||
raise ValueError("steps/manifest must include `improve: import_payload` in kernel-bootstrap mode.")
|
||||
bootstrap_set = set(self.bootstrap_source_manifest)
|
||||
self.payload_source_manifest = [entry for entry in full_manifest if entry not in bootstrap_set]
|
||||
|
||||
|
|
@ -83,11 +93,12 @@ class Generator():
|
|||
self.check_file(distfile_path, checksum)
|
||||
|
||||
def _create_raw_payload_image(self, target_path, manifest):
|
||||
if not manifest:
|
||||
return None
|
||||
if manifest is None:
|
||||
manifest = []
|
||||
|
||||
# Guarantee all payload distfiles exist and match checksums.
|
||||
self._ensure_manifest_distfiles(manifest)
|
||||
if manifest:
|
||||
# Guarantee all payload distfiles exist and match checksums.
|
||||
self._ensure_manifest_distfiles(manifest)
|
||||
|
||||
files_by_name = {}
|
||||
for checksum, _, _, file_name in manifest:
|
||||
|
|
@ -180,7 +191,9 @@ class Generator():
|
|||
if self.repo_path or self.external_sources:
|
||||
mkfs_args = ['-d', os.path.join(target.path, 'external')]
|
||||
target.add_disk("external", filesystem="ext3", mkfs_args=mkfs_args)
|
||||
elif self.payload_source_manifest:
|
||||
else:
|
||||
# Offline kernel-bootstrap mode keeps the early image small and
|
||||
# puts remaining distfiles in payload.img.
|
||||
self.payload_image = self._create_raw_payload_image(target.path, self.payload_source_manifest)
|
||||
target.add_existing_disk("payload", self.payload_image)
|
||||
elif using_kernel:
|
||||
|
|
@ -420,7 +433,7 @@ this script the next time")
|
|||
self.check_file(path, line[0])
|
||||
|
||||
@classmethod
|
||||
def get_source_manifest(cls, pre_network=False, build_guix_also=False):
|
||||
def get_source_manifest(cls, stop_before_improve=None, build_guix_also=False):
|
||||
"""
|
||||
Generate a source manifest for the system.
|
||||
"""
|
||||
|
|
@ -443,10 +456,13 @@ this script the next time")
|
|||
|
||||
with open(manifest_path, 'r', encoding="utf_8") as file:
|
||||
for line in file:
|
||||
if pre_network and line.strip().startswith("improve: ") and "network" in line:
|
||||
break
|
||||
stripped = line.strip()
|
||||
if stop_before_improve and stripped.startswith("improve: "):
|
||||
improve_step = stripped.split(" ")[1].split("#")[0].strip()
|
||||
if improve_step == stop_before_improve:
|
||||
break
|
||||
|
||||
if not line.strip().startswith("build: "):
|
||||
if not stripped.startswith("build: "):
|
||||
continue
|
||||
|
||||
step = line.split(" ")[1].split("#")[0].strip()
|
||||
|
|
|
|||
|
|
@ -71,12 +71,14 @@ with open(config_path, "r", encoding="utf-8") as cfg:
|
|||
if not line.startswith("BUILD_GUIX_ALSO=")
|
||||
and not line.startswith("MIRRORS=")
|
||||
and not line.startswith("MIRRORS_LEN=")
|
||||
and not line.startswith("PAYLOAD_REQUIRED=")
|
||||
]
|
||||
if build_guix_also:
|
||||
lines.append("BUILD_GUIX_ALSO=True\\n")
|
||||
if mirrors:
|
||||
lines.append(f'MIRRORS="{" ".join(mirrors)}"\\n')
|
||||
lines.append(f"MIRRORS_LEN={len(mirrors)}\\n")
|
||||
lines.append("PAYLOAD_REQUIRED=False\\n")
|
||||
with open(config_path, "w", encoding="utf-8") as cfg:
|
||||
cfg.writelines(lines)
|
||||
|
||||
|
|
@ -470,9 +472,10 @@ print(shutil.which('chroot'))
|
|||
arg_list += [
|
||||
'-drive', 'file=' + target.get_disk("external") + ',format=raw',
|
||||
]
|
||||
if target.get_disk("payload") is not None:
|
||||
payload_disk = target.get_disk("payload")
|
||||
if payload_disk is not None:
|
||||
arg_list += [
|
||||
'-drive', 'file=' + target.get_disk("payload") + ',format=raw',
|
||||
'-drive', 'file=' + payload_disk + ',format=raw',
|
||||
]
|
||||
arg_list += [
|
||||
'-machine', 'kernel-irqchip=split',
|
||||
|
|
|
|||
|
|
@ -7,5 +7,31 @@ set -ex
|
|||
|
||||
if [ "${PAYLOAD_REQUIRED}" = True ]; then
|
||||
mkdir -p /external/distfiles
|
||||
payload-import /external/distfiles
|
||||
found_payload=0
|
||||
|
||||
# Probe all block devices reported by the running kernel instead of
|
||||
# assuming fixed names like /dev/sdb or /dev/hdb.
|
||||
while read -r major minor blocks name; do
|
||||
case "${name}" in
|
||||
""|name|ram*|loop*|fd*|sr*|md*)
|
||||
continue
|
||||
;;
|
||||
esac
|
||||
|
||||
dev_path="/dev/${name}"
|
||||
if [ ! -b "${dev_path}" ]; then
|
||||
mknod -m 600 "${dev_path}" b "${major}" "${minor}" || :
|
||||
fi
|
||||
|
||||
if payload-import --probe "${dev_path}"; then
|
||||
payload-import --device "${dev_path}" /external/distfiles
|
||||
found_payload=1
|
||||
break
|
||||
fi
|
||||
done < /proc/partitions
|
||||
|
||||
if [ "${found_payload}" != 1 ]; then
|
||||
echo "payload-import failed: no payload image found on detected block devices." >&2
|
||||
exit 1
|
||||
fi
|
||||
fi
|
||||
|
|
|
|||