mirror of
https://github.com/fosslinux/live-bootstrap.git
synced 2026-03-23 19:46:31 +01:00
refactor+docs(payload.img, payload.img discovery): split offline distfiles at improve: import_payload so main image is minimal and payload.img is primary carrier, detect payload.img automatically using magic number
parent
e20afe69bb
commit
beb9fb12f9
5 changed files with 234 additions and 26 deletions
102
Payload_img_design.md
Normal file
@@ -0,0 +1,102 @@
# live-bootstrap

This repository uses [`README.rst`](./README.rst) as the canonical main documentation.

## Kernel-bootstrap `payload.img`

`payload.img` is a raw container disk used in kernel-bootstrap offline mode
(`--repo` and `--external-sources` are both unset).

### Why not put everything in the initial image?

In kernel-bootstrap mode, the first boot image is consumed by very early
runtime code before the system reaches the normal bash-based build stage.
That early stage makes tight assumptions about memory layout and file table usage.

When too many distfiles are packed into the initial image, those assumptions can
be violated, which leads to unstable handoff behavior (for example, failures
around the Fiwix transition in QEMU or on bare metal).

So the design is intentionally split:

- Initial image: only what is required to reach `improve: import_payload`
- `payload.img`: the rest of the offline distfiles

This is not a patch-style workaround. It is a two-phase transport design that
keeps early boot deterministic and moves bulk data import to a stage where the
runtime is robust enough to process it safely.
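
The split amounts to an order-preserving set difference over manifest entries. The following is a minimal sketch, not the project's implementation: `split_manifest` and the sample entries are hypothetical, and it assumes each manifest entry is a hashable tuple.

```python
def split_manifest(full_manifest, bootstrap_manifest):
    """Return (initial_image_entries, payload_entries), keeping manifest order."""
    bootstrap_set = set(bootstrap_manifest)
    # Everything not needed before the importer runs goes into payload.img.
    payload = [entry for entry in full_manifest if entry not in bootstrap_set]
    return list(bootstrap_manifest), payload


# Hypothetical entries of the form (checksum, file name).
full = [
    ("aaa", "early-1.tar.gz"),
    ("bbb", "early-2.tar.gz"),
    ("ccc", "late-1.tar.xz"),
]
early = full[:2]

initial, payload = split_manifest(full, early)
print(payload)
```

If the two manifests were identical, the payload list would be empty, which is the situation the early stage is meant to avoid.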

### Why import from an external image and copy into main filesystem?

Because the bootstrap still expects distfiles to end up under the normal local
path (`/external/distfiles`) for later steps. `payload.img` is used as a
transport medium only.

The flow is:

1. Boot minimal initial image.
2. Reach `improve: import_payload`.
3. Detect the payload disk by magic (`LBPAYLD1`) across detected block devices.
4. Copy payload files into `/external/distfiles`.
5. Continue the build exactly as if files had been present locally all along.
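
Step 3 of the flow can be illustrated in a few lines of Python. This is only a sketch of the idea: the real probing is done by the `payload-import` tool, and the helper names `has_payload_magic` and `find_payload` are hypothetical.

```python
MAGIC = b"LBPAYLD1"


def has_payload_magic(path):
    """Return True if the first 8 bytes of `path` equal the payload magic."""
    try:
        with open(path, "rb") as dev:
            return dev.read(len(MAGIC)) == MAGIC
    except OSError:
        # Unreadable or missing devices are simply not the payload disk.
        return False


def find_payload(candidates):
    """Return the first candidate path carrying the magic, or None."""
    for path in candidates:
        if has_payload_magic(path):
            return path
    return None
```

Because the check looks at content rather than at a device name, the same logic works whether the payload disk shows up as the second, third, or any other block device.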

### Format

- Magic: `LBPAYLD1` (8 bytes)
- Then: little-endian `u32` file count
- Repeated entries:
  - little-endian `u32` name length
  - little-endian `u32` file size
  - file name bytes (no terminator)
  - file bytes

The importer probes detected block devices and selects the one with magic `LBPAYLD1`.
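
The layout above maps directly onto Python's `struct` module. The sketch below writes and reads the container in memory; it is an illustration of the format, not the code used by `rootfs.py` or `payload-import`. The function names are hypothetical, and encoding file names as UTF-8 is an assumption (the format only specifies raw name bytes).

```python
import io
import struct

MAGIC = b"LBPAYLD1"


def write_payload(stream, files):
    """Write (name, data) pairs to `stream` in the payload container layout."""
    stream.write(MAGIC)
    stream.write(struct.pack("<I", len(files)))  # little-endian u32 file count
    for name, data in files:
        raw_name = name.encode("utf-8")  # assumption: names are UTF-8
        stream.write(struct.pack("<II", len(raw_name), len(data)))
        stream.write(raw_name)  # no terminator
        stream.write(data)


def read_payload(stream):
    """Parse a payload container and return a list of (name, data) pairs."""
    if stream.read(len(MAGIC)) != MAGIC:
        raise ValueError("missing LBPAYLD1 magic")
    (count,) = struct.unpack("<I", stream.read(4))
    files = []
    for _ in range(count):
        name_len, size = struct.unpack("<II", stream.read(8))
        name = stream.read(name_len).decode("utf-8")
        files.append((name, stream.read(size)))
    return files


buf = io.BytesIO()
write_payload(buf, [("a.tar.gz", b"hello"), ("b.tar.xz", b"")])
buf.seek(0)
print(read_payload(buf))
```

Since the size field is a `u32`, a single file in this layout cannot exceed 4 GiB minus one byte.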

### Manual creation without Python

Prepare `payload.list` as:

```text
<archive-name> <absolute-path-to-archive>
```

Then:

```sh
cat > make-payload.sh <<'SH'
#!/bin/sh
set -e
out="${1:-payload.img}"
list="${2:-payload.list}"

write_u32le() {
  v="$1"
  printf '%08x' "$v" | sed -E 's/(..)(..)(..)(..)/\4\3\2\1/' | xxd -r -p
}

count="$(grep -c '[^[:space:]]' "${list}" || :)"
: > "${out}"
printf 'LBPAYLD1' >> "${out}"
write_u32le "${count}" >> "${out}"

while read -r name path || [ -n "${name}" ]; do
  [ -n "${name}" ] || continue
  size="$(wc -c < "${path}" | tr -d ' ')"
  write_u32le "${#name}" >> "${out}"
  write_u32le "${size}" >> "${out}"
  printf '%s' "${name}" >> "${out}"
  cat "${path}" >> "${out}"
done < "${list}"
SH
chmod +x make-payload.sh
./make-payload.sh payload.img payload.list
```

Attach `payload.img` as an extra raw disk in QEMU, or as the second disk on bare metal.

### When it is used

- Used in kernel-bootstrap offline mode.
- Not used when `--repo` or `--external-sources` is provided.
- `--build-guix-also` increases payload contents (includes post-early `steps-guix`
  sources), but does not change the mechanism.
73
README.rst
@@ -63,17 +63,78 @@ Without using Python:

 * *Only* copy distfiles listed in ``sources`` files for ``build:`` steps
   manifested before ``improve: get_network`` into this disk.
-* Optionally (if you don't do this, distfiles will be network downloaded):
+* In kernel-bootstrap offline mode (no ``--repo`` and no
+  ``--external-sources``), use the second image as ``payload.img``.
+  ``payload.img`` is a raw container (not a filesystem) used to carry the
+  distfiles that are not needed before ``improve: import_payload``.
+  In other words, the first image only carries the minimal set needed to
+  reach the importer; the rest of the offline distfiles live in payload.

-* On the second image, create an MSDOS partition table and one ext3
-  partition.
-* Copy ``distfiles/`` into this disk.
-* Run QEMU, with 4+G RAM, optionally SMP (multicore), both drives (in the
-  order introduced above), a NIC with model E1000
+* Header magic: ``LBPAYLD1`` (8 bytes).
+* Then: little-endian ``u32`` file count.
+* Repeated for each file: little-endian ``u32`` name length,
+  little-endian ``u32`` file size, raw file name bytes, raw file bytes.
+
+* If you are not in that mode, the second disk can still be used as an
+  optional ext3 distfiles disk, as before.
+* Run QEMU, with 4+G RAM, optionally SMP (multicore), both drives (main
+  builder image plus payload/ext3 image), a NIC with model E1000
   (``-nic user,model=e1000``), and ``-machine kernel-irqchip=split``.
 c. **Bare metal:** Follow the same steps as QEMU, but the disks need to be
    two different *physical* disks, and boot from the first disk.

+Manual ``payload.img`` preparation
+----------------------------------
+
+The following script creates a raw ``payload.img`` from a manually prepared
+file list. This is equivalent to what ``rootfs.py`` does for kernel-bootstrap
+offline mode.
+
+1. Prepare a ``payload.list`` with one file per line, formatted as:
+   ``<archive-name> <absolute-path-to-archive>``.
+2. Run:
+
+   ::
+
+     cat > make-payload.sh <<'EOF'
+     #!/bin/sh
+     set -e
+     out="${1:-payload.img}"
+     list="${2:-payload.list}"
+
+     write_u32le() {
+       v="$1"
+       printf '%08x' "$v" | sed -E 's/(..)(..)(..)(..)/\4\3\2\1/' | xxd -r -p
+     }
+
+     count="$(grep -c '[^[:space:]]' "${list}" || :)"
+     : > "${out}"
+     printf 'LBPAYLD1' >> "${out}"
+     write_u32le "${count}" >> "${out}"
+
+     while read -r name path || [ -n "${name}" ]; do
+       [ -n "${name}" ] || continue
+       size="$(wc -c < "${path}" | tr -d ' ')"
+       write_u32le "${#name}" >> "${out}"
+       write_u32le "${size}" >> "${out}"
+       printf '%s' "${name}" >> "${out}"
+       cat "${path}" >> "${out}"
+     done < "${list}"
+     EOF
+     chmod +x make-payload.sh
+     ./make-payload.sh payload.img payload.list
+
+3. Attach ``payload.img`` as an additional raw disk when booting in QEMU, or
+   as the second physical disk on bare metal.
+
+Notes:
+
+* ``payload.img`` is used in kernel-bootstrap offline mode regardless of
+  ``--build-guix-also``. With ``--build-guix-also``, the payload content is
+  larger because it also includes post-early sources from ``steps-guix``.
+* The runtime importer identifies the correct disk by checking the magic
+  ``LBPAYLD1`` on each detected block device, not by assuming a device name.
+
 Mirrors
 -------

@@ -37,10 +37,14 @@ class Generator():
         self.repo_path = repo_path
         self.mirrors = mirrors
         self.build_guix_also = build_guix_also
-        self.source_manifest = self.get_source_manifest(not self.external_sources,
-                                                        build_guix_also=self.build_guix_also)
-        self.early_source_manifest = self.get_source_manifest(True,
-                                                              build_guix_also=self.build_guix_also)
+        self.source_manifest = self.get_source_manifest(
+            stop_before_improve=("get_network" if not self.external_sources else None),
+            build_guix_also=self.build_guix_also
+        )
+        self.early_source_manifest = self.get_source_manifest(
+            stop_before_improve="get_network",
+            build_guix_also=self.build_guix_also
+        )
         self.bootstrap_source_manifest = self.source_manifest
         self.payload_source_manifest = []
         self.payload_image = None
@@ -59,11 +63,17 @@ class Generator():
         """
         Split early source payload from full offline payload.
         """
-        # Keep the early builder payload small enough to avoid overrunning
-        # builder-hex0 memory file allocation before we can jump into Fiwix.
-        self.bootstrap_source_manifest = self.get_source_manifest(True, build_guix_also=False)
+        # Keep the early builder payload small: include only sources needed
+        # before improve: import_payload runs, so payload.img is the primary
+        # carrier for the rest of the offline distfiles.
+        self.bootstrap_source_manifest = self.get_source_manifest(
+            stop_before_improve="import_payload",
+            build_guix_also=False
+        )

-        full_manifest = self.get_source_manifest(False, build_guix_also=self.build_guix_also)
+        full_manifest = self.get_source_manifest(build_guix_also=self.build_guix_also)
+        if self.bootstrap_source_manifest == full_manifest:
+            raise ValueError("steps/manifest must include `improve: import_payload` in kernel-bootstrap mode.")
         bootstrap_set = set(self.bootstrap_source_manifest)
         self.payload_source_manifest = [entry for entry in full_manifest if entry not in bootstrap_set]

@@ -83,9 +93,10 @@ class Generator():
             self.check_file(distfile_path, checksum)

     def _create_raw_payload_image(self, target_path, manifest):
-        if not manifest:
-            return None
+        if manifest is None:
+            manifest = []

+        if manifest:
             # Guarantee all payload distfiles exist and match checksums.
             self._ensure_manifest_distfiles(manifest)

@@ -180,7 +191,9 @@ class Generator():
             if self.repo_path or self.external_sources:
                 mkfs_args = ['-d', os.path.join(target.path, 'external')]
                 target.add_disk("external", filesystem="ext3", mkfs_args=mkfs_args)
-            elif self.payload_source_manifest:
+            else:
+                # Offline kernel-bootstrap mode keeps the early image small and
+                # puts remaining distfiles in payload.img.
                 self.payload_image = self._create_raw_payload_image(target.path, self.payload_source_manifest)
                 target.add_existing_disk("payload", self.payload_image)
         elif using_kernel:
@@ -420,7 +433,7 @@ this script the next time")
                 self.check_file(path, line[0])

     @classmethod
-    def get_source_manifest(cls, pre_network=False, build_guix_also=False):
+    def get_source_manifest(cls, stop_before_improve=None, build_guix_also=False):
         """
         Generate a source manifest for the system.
         """
@@ -443,10 +456,13 @@ this script the next time")

         with open(manifest_path, 'r', encoding="utf_8") as file:
             for line in file:
-                if pre_network and line.strip().startswith("improve: ") and "network" in line:
+                stripped = line.strip()
+                if stop_before_improve and stripped.startswith("improve: "):
+                    improve_step = stripped.split(" ")[1].split("#")[0].strip()
+                    if improve_step == stop_before_improve:
                         break

-                if not line.strip().startswith("build: "):
+                if not stripped.startswith("build: "):
                     continue

                 step = line.split(" ")[1].split("#")[0].strip()
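
The effect of the new `stop_before_improve` parameter can be seen in isolation with a small standalone sketch. It reuses the parsing expressions from the hunk above but is not the project's method: `steps_before` and the sample manifest lines are hypothetical.

```python
def steps_before(manifest_lines, stop_before_improve=None):
    """Collect `build:` step names, stopping at the named `improve:` step."""
    steps = []
    for line in manifest_lines:
        stripped = line.strip()
        if stop_before_improve and stripped.startswith("improve: "):
            improve_step = stripped.split(" ")[1].split("#")[0].strip()
            if improve_step == stop_before_improve:
                break
        if not stripped.startswith("build: "):
            continue
        steps.append(stripped.split(" ")[1].split("#")[0].strip())
    return steps


# Hypothetical manifest lines for illustration only.
sample = [
    "build: pkg-a-1.0",
    "improve: import_payload",
    "build: pkg-b-2.0",
    "improve: get_network",
    "build: pkg-c-3.0",
]

print(steps_before(sample, "import_payload"))
print(steps_before(sample, "get_network"))
print(steps_before(sample))
```

Compared with the old `"network" in line` substring test, matching the exact step name lets the same function cut the manifest at either `import_payload` or `get_network`.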

@@ -71,12 +71,14 @@ with open(config_path, "r", encoding="utf-8") as cfg:
         if not line.startswith("BUILD_GUIX_ALSO=")
         and not line.startswith("MIRRORS=")
         and not line.startswith("MIRRORS_LEN=")
+        and not line.startswith("PAYLOAD_REQUIRED=")
     ]
 if build_guix_also:
     lines.append("BUILD_GUIX_ALSO=True\\n")
 if mirrors:
     lines.append(f'MIRRORS="{" ".join(mirrors)}"\\n')
     lines.append(f"MIRRORS_LEN={len(mirrors)}\\n")
+lines.append("PAYLOAD_REQUIRED=False\\n")
 with open(config_path, "w", encoding="utf-8") as cfg:
     cfg.writelines(lines)

@@ -470,9 +472,10 @@ print(shutil.which('chroot'))
         arg_list += [
             '-drive', 'file=' + target.get_disk("external") + ',format=raw',
         ]
-    if target.get_disk("payload") is not None:
+    payload_disk = target.get_disk("payload")
+    if payload_disk is not None:
         arg_list += [
-            '-drive', 'file=' + target.get_disk("payload") + ',format=raw',
+            '-drive', 'file=' + payload_disk + ',format=raw',
         ]
     arg_list += [
         '-machine', 'kernel-irqchip=split',

@@ -7,5 +7,31 @@ set -ex

 if [ "${PAYLOAD_REQUIRED}" = True ]; then
     mkdir -p /external/distfiles
-    payload-import /external/distfiles
+    found_payload=0
+
+    # Probe all block devices reported by the running kernel instead of
+    # assuming fixed names like /dev/sdb or /dev/hdb.
+    while read -r major minor blocks name; do
+        case "${name}" in
+            ""|name|ram*|loop*|fd*|sr*|md*)
+                continue
+                ;;
+        esac
+
+        dev_path="/dev/${name}"
+        if [ ! -b "${dev_path}" ]; then
+            mknod -m 600 "${dev_path}" b "${major}" "${minor}" || :
+        fi
+
+        if payload-import --probe "${dev_path}"; then
+            payload-import --device "${dev_path}" /external/distfiles
+            found_payload=1
+            break
+        fi
+    done < /proc/partitions
+
+    if [ "${found_payload}" != 1 ]; then
+        echo "payload-import failed: no payload image found on detected block devices." >&2
+        exit 1
+    fi
 fi
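
The device enumeration in the loop above can be mirrored in Python to show which names survive the filter. This is a sketch for illustration only: the boot-time script is shell, `candidate_devices` is a hypothetical helper, and the sample text merely imitates the usual `/proc/partitions` layout.

```python
import fnmatch

# Same name patterns the shell `case` statement skips.
SKIP_PATTERNS = ("ram*", "loop*", "fd*", "sr*", "md*")


def candidate_devices(partitions_text):
    """Return (device path, major, minor) for each entry worth probing."""
    devices = []
    for line in partitions_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "major minor #blocks name" header row.
        if len(fields) != 4 or fields[3] == "name":
            continue
        major, minor, _blocks, name = fields
        if any(fnmatch.fnmatchcase(name, pat) for pat in SKIP_PATTERNS):
            continue
        devices.append((f"/dev/{name}", int(major), int(minor)))
    return devices


sample = """major minor  #blocks  name

   8        0   16777216 sda
   8        1   16776192 sda1
   8       16    1048576 sdb
   7        0      65536 loop0
  11        0    1048575 sr0
"""

print(candidate_devices(sample))
```

Each returned path would then be checked for the `LBPAYLD1` magic; the major and minor numbers are kept because the shell script needs them for `mknod` when the device node does not exist yet.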