================================================================================
CODE_BUILD_ORDER_AND_MODULE_PLAN_5_12_26.txt
CollabORhythm / Collabtunes — Engineering Blueprint Phase
Generated: 5.12.26 | Black Claude — Blueprint Session
PURPOSE: Exact module breakdown and build order for heavy coding phase
STATUS: ENGINEERING PREP — this is what Mixed Claude builds from
================================================================================

READING ORDER FOR MIXED CLAUDE (HEAVY CODING PHASE):
  1. FINAL_SELF_GATHERING_CODE_SPEC_5_12_26.txt     → master spec
  2. THIS FILE                                       → module plan and build order
  3. SELF_GATHERING_CODE_1_FLOWMAP_5_12_26.txt       → SGC-1 phase-by-phase logic
  4. SELF_GATHERING_CODE_2_FLOWMAP_5_12_26.txt       → SGC-2 phase-by-phase logic
  5. INPUT_OUTPUT_DEPENDENCY_MAP_5_12_26.txt         → I/O contracts
  6. SAFE_RUN_AND_ROLLBACK_PROCEDURES_5_12_26.txt    → safety rules to enforce in code

================================================================================
SECTION 1 — BUILD ORDER (Which modules get built first)
================================================================================

REASON FOR ORDER:
  Shared modules must exist before code-specific modules import them.
  Safety guards must exist before any network or file I/O runs.
  Dry-run mode must work before live-run mode is activated.

BUILD SEQUENCE:

STAGE 1 — SHARED FOUNDATION (build these first, used by both SGC-1 and SGC-2)
  Module 1: shared_config.py
  Module 2: shared_logger.py
  Module 3: shared_output_writer.py
  Module 4: shared_checkpoint.py
  Module 5: shared_naming_validator.py

STAGE 2 — SGC-1 MODULES (build in this order)
  Module 6:  sgc1_args.py
  Module 7:  sgc1_seed_loader.py
  Module 8:  sgc1_url_normalizer.py
  Module 9:  sgc1_http_requester.py
  Module 10: sgc1_nav_crossref.py
  Module 11: sgc1_body_parser.py
  Module 12: sgc1_routing_classifier.py
  Module 13: sgc1_conflict_detector.py
  Module 14: sgc1_deduplicator.py
  Module 15: sgc1_exporter.py
  Module 16: sgc1_main.py (orchestrator — imports all SGC-1 modules)

STAGE 3 — SGC-2 MODULES (build in this order)
  Module 17: sgc2_args.py
  Module 18: sgc2_dir_walker.py
  Module 19: sgc2_sensitive_guard.py
  Module 20: sgc2_filename_parser.py
  Module 21: sgc2_zip_inspector.py
  Module 22: sgc2_txt_scanner.py
  Module 23: sgc2_folder_validator.py
  Module 24: sgc2_dependency_mapper.py
  Module 25: sgc2_deduplicator.py
  Module 26: sgc2_exporter.py
  Module 27: sgc2_main.py (orchestrator — imports all SGC-2 modules)

STAGE 4 — INTEGRATION
  Module 28: run_both.py (optional runner — executes SGC-2 then SGC-1 in sequence)

================================================================================
SECTION 2 — SHARED MODULES (Stage 1)
================================================================================

MODULE 1: shared_config.py
  PURPOSE: Central constants and configuration values
  CONTAINS:
    ALLOWED_OUTPUT_ROOT = "./outputs/"
    ALLOWED_LOG_ROOT = "./logs/"
    SENSITIVE_FILENAME_PATTERNS = [
      "DEFAMATION_RISK_REGISTRY",
      "CREATOR_INTERVIEW_TRANSCRIPT",
      "MASTER_DUMPS"
    ]
    BANNED_FILENAMES = ["final", "output", "new", "fixed", "temp", "test",
                         "untitled", "copy", "export", "revised"]
    APPROVED_CATEGORIES = [
      "LIVE_CAPTURE", "CANON", "QA", "HTML", "REGISTRY", "DEPLOYMENT",
      "BACKUP", "PLACEHOLDERS", "RATINGS", "URL_MAPS", "BLOCKERS",
      "NAV_STABILIZATION", "TOM_DECISIONS", "MASTER_DUMPS", "GENERATOR",
      "HANDOFF", "MANIFESTS", "AUTHORITY", "PREHANDOFF", "ROLLBACK"
    ]
    PAGE_TYPES = ["ALBUM_AIO", "NAV", "SONGBOOK", "QUICKGUIDE", "HGIH",
                   "READMYSTUFF", "PLACEHOLDER", "DEV", "UNKNOWN"]
    AUTHORITY_LEVELS = ["LOCKED", "AUTHORITATIVE", "REFERENCE", "DEPRECATED",
                         "WORKING", "CONFLICT", "UNKNOWN"]
    RATE_LIMIT_SECONDS = 1.5   # SGC-1 HEAD requests
    BODY_RATE_LIMIT_SECONDS = 2.0  # SGC-1 GET requests
    DEFAULT_MAX_PAGES = 200
    DEFAULT_MAX_ZIP_DEPTH = 2
    TXT_HEADER_SCAN_CHARS = 2000
    TXT_FULL_READ_MAX_BYTES = 50_000
  IMPORTS: none (no circular dependencies)

---

MODULE 2: shared_logger.py
  PURPOSE: Centralized run log with timestamps and severity levels
  CONTAINS:
    class RunLogger:
      __init__(run_id, mode, log_dir)
      log(step, message, severity="INFO")  → appends to run_log[]
      log_error(step, error_code, message) → appends + increments error_count
      log_abort(reason)                    → writes ABORT entry, raises SystemExit
      get_log()                            → returns run_log[]
      write_log_to_file()                  → writes run_log to /logs/
    Severity levels: DEBUG | INFO | WARNING | ERROR | CRITICAL | ABORT
  IMPORTS: shared_config
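
  A minimal sketch of the RunLogger interface above (the entry-dict keys and
  the log-file line format are assumptions, not fixed by this spec):

```python
import os
import time

class RunLogger:
    def __init__(self, run_id, mode, log_dir):
        self.run_id = run_id
        self.mode = mode
        self.log_dir = log_dir
        self.run_log = []
        self.error_count = 0

    def log(self, step, message, severity="INFO"):
        # every entry carries a timestamp, step name, and severity
        self.run_log.append({
            "ts": time.strftime("%Y-%m-%dT%H:%M:%S"),
            "step": step, "severity": severity, "message": message,
        })

    def log_error(self, step, error_code, message):
        self.error_count += 1
        self.log(step, f"[{error_code}] {message}", severity="ERROR")

    def log_abort(self, reason):
        # ABORT entry is written before the process is stopped
        self.log("ABORT", reason, severity="ABORT")
        raise SystemExit(reason)

    def get_log(self):
        return self.run_log

    def write_log_to_file(self):
        path = os.path.join(self.log_dir, f"{self.run_id}_RUN_LOG.txt")
        with open(path, "w") as fh:
            for e in self.run_log:
                fh.write(f"{e['ts']} {e['severity']:8} {e['step']}: {e['message']}\n")
        return path
```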

---

MODULE 3: shared_output_writer.py
  PURPOSE: All file writes go through this — enforces path safety
  CONTAINS:
    def validate_output_path(filepath):
      assert filepath starts with ALLOWED_OUTPUT_ROOT or ALLOWED_LOG_ROOT
      assert not os.path.exists(filepath)  → raises if would overwrite
      return True
    def write_json(filepath, data, logger):
      validate_output_path(filepath)
      with open(filepath, 'w') as fh:   # context manager → guaranteed close/flush
        json.dump(data, fh, indent=2)
      verify: os.path.getsize(filepath) > 0 (else log WRITE_ERROR)
    def write_txt(filepath, text, logger):
      validate_output_path(filepath)
      with open(filepath, 'w') as fh:
        fh.write(text)
      verify: os.path.getsize(filepath) > 0
    def generate_unique_filename(base, ext, dir):
      → timestamp to millisecond precision
      → checks for collision, appends _[N] if needed
  IMPORTS: shared_config, shared_logger
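
  A sketch of the path gate and collision-safe filename helper; the allowed
  roots are inlined here rather than imported from shared_config, and the
  exact timestamp format is an assumption:

```python
import os
import time

ALLOWED_ROOTS = ("./outputs/", "./logs/")

def validate_output_path(filepath):
    # every write must land under an allowed root and must not overwrite
    if not filepath.startswith(ALLOWED_ROOTS):
        raise ValueError(f"path outside allowed roots: {filepath}")
    if os.path.exists(filepath):
        raise FileExistsError(f"refusing to overwrite: {filepath}")
    return True

def generate_unique_filename(base, ext, out_dir):
    # timestamp to millisecond precision, then _[N] suffix on collision
    stamp = time.strftime("%Y%m%d_%H%M%S") + f"_{int(time.time() * 1000) % 1000:03d}"
    candidate = os.path.join(out_dir, f"{base}_{stamp}.{ext}")
    n = 1
    while os.path.exists(candidate):
        candidate = os.path.join(out_dir, f"{base}_{stamp}_{n}.{ext}")
        n += 1
    return candidate
```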

---

MODULE 4: shared_checkpoint.py
  PURPOSE: Checkpoint write/read/resume logic
  CONTAINS:
    def write_checkpoint(run_id, phase, data, log_dir):
      filename = f"{run_id}_CHECKPOINT_{phase}.json"
      write to /logs/
    def read_checkpoint(checkpoint_filepath):
      → load JSON, return {run_id, phase, data}
    def detect_partial_run(log_dir, run_prefix):
      → scan /logs/ for PARTIAL checkpoints from current or prior runs
      → return list of incomplete runs if found
    def checkpoint_phase_key(phase_number):
      → returns standard key: "PHASE{N}_COMPLETE"
  IMPORTS: shared_config, shared_output_writer, shared_logger

---

MODULE 5: shared_naming_validator.py
  PURPOSE: Validate filenames against MASTER_NAMING_STANDARD
  CONTAINS:
    DATE_PATTERN = re.compile(r'_\d{1,2}_\d{2}_\d{2}')
    VOL_PATTERN = re.compile(r'VOL(\d+)')
    AUTHORITY_TAG_PATTERN = re.compile(
      r'(ACTIVE_CANON|DEPRECATED|ARCHIVE|SUPERSEDED|LOCKED)')
    def parse_filename(filename):
      → extract: file_count, category[], date_token, vol_number, authority_tag,
                  project_tag, purpose
      → return dict
    def validate_filename(filename):
      → check for date, category, no banned words
      → return {compliant: bool, violations: []}
    def infer_authority_from_filename(filename):
      → ACTIVE_CANON → LOCKED
      → DEPRECATED or ARCHIVE → DEPRECATED
      → SUPERSEDED → REFERENCE
      → FIXED_COLOR → AUTHORITATIVE (known-good pattern for HTML files)
      → highest VOL → AUTHORITATIVE (if multiple files same base)
      → default → UNKNOWN
  IMPORTS: shared_config
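
  A sketch of the compliance check using the patterns above; the banned-word
  test is simplified to token matching on underscore-split names (an
  assumption about how the real validator tokenizes):

```python
import re

DATE_PATTERN = re.compile(r'_\d{1,2}_\d{2}_\d{2}')
BANNED = {"final", "output", "new", "fixed", "temp", "test",
          "untitled", "copy", "export", "revised"}

def validate_filename(filename):
    violations = []
    if not DATE_PATTERN.search(filename):
        violations.append("MISSING_DATE")
    stem = filename.rsplit(".", 1)[0]
    # flag any underscore-delimited token that is a banned word
    for token in stem.split("_"):
        if token.lower() in BANNED:
            violations.append(f"BANNED_WORD:{token}")
    return {"compliant": not violations, "violations": violations}
```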

================================================================================
SECTION 3 — SGC-1 MODULES (Stage 2)
================================================================================

MODULE 6: sgc1_args.py
  PURPOSE: CLI argument parsing for SGC-1
  CONTAINS:
    def parse_args() → argparse.Namespace with:
      mode, include_x, max_pages, output_dir, seed_file, nav_file, resume
  IMPORTS: argparse, shared_config

---

MODULE 7: sgc1_seed_loader.py
  PURPOSE: Load and parse MASTER_URL_AUTHORITY_REGISTRY
  CONTAINS:
    def load_seed_urls(filepath):
      → parse TXT file line by line
      → extract URL, status flag (✅/❌/⏳/⚠️/🔶/🔧), label, rating, section
      → return seed_urls[] list of dicts
    def filter_for_crawl(seed_urls, include_x):
      → skip PENDING (⏳), DEV (🔧), BROKEN (❌)
      → skip X_RATED unless include_x flag set
      → return crawl_urls[]
    def load_nav_reference(filepath):
      → parse FINAL_NAVIGATION_AUTHORITY_MAP
      → extract: 14-section structure, chapter drift map, routing logic
      → return nav_reference dict
  IMPORTS: shared_config, shared_logger
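
  A sketch of the seed-line parse and crawl filter. The registry's exact line
  layout is an assumption here ("<flag> <url> <label...>"); the status-flag
  mapping follows the flags listed above:

```python
STATUS_FLAGS = {"✅": "LIVE", "❌": "BROKEN", "⏳": "PENDING",
                "⚠️": "WARNING", "🔶": "X_RATED", "🔧": "DEV"}

def parse_seed_line(line):
    line = line.strip()
    for flag, status in STATUS_FLAGS.items():
        if line.startswith(flag):
            rest = line[len(flag):].strip()
            parts = rest.split(None, 1)
            return {"status": status,
                    "url": parts[0] if parts else "",
                    "label": parts[1] if len(parts) > 1 else ""}
    return None  # line carries no recognized status flag

def filter_for_crawl(seed_urls, include_x=False):
    skip = {"PENDING", "DEV", "BROKEN"}
    return [s for s in seed_urls
            if s["status"] not in skip
            and (include_x or s["status"] != "X_RATED")]
```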

---

MODULE 8: sgc1_url_normalizer.py
  PURPOSE: URL standardization
  CONTAINS:
    BASE_URL = "https://collabtunes.com"
    def normalize(url):
      → ensure https:// prefix
      → add trailing slash if missing
      → lowercase slug for comparison
      → return canonical form
    def to_slug(url):
      → strip base URL, return /slug/ only
    def classify_page_type(url, label=""):
      → apply URL pattern matching rules from Flowmap Phase 1.3
      → return page_type string
  IMPORTS: urllib.parse, shared_config
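
  A sketch of normalize() and to_slug(). Note the spec's wording: the
  canonical form keeps original case, while the slug is lowercased for
  comparison only:

```python
from urllib.parse import urlparse

BASE_URL = "https://collabtunes.com"

def normalize(url):
    url = url.strip()
    if url.startswith("http://"):
        url = "https://" + url[len("http://"):]          # force https
    elif not url.startswith("https://"):
        url = BASE_URL.rstrip("/") + "/" + url.lstrip("/")  # bare slug → full URL
    if not url.endswith("/"):
        url += "/"                                       # trailing slash
    return url

def to_slug(url):
    # strip base URL, lowercase for comparison
    return urlparse(normalize(url)).path.lower()
```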

---

MODULE 9: sgc1_http_requester.py
  PURPOSE: Rate-limited HTTP requests with retry logic
  CONTAINS:
    def head_request(url, timeout=10, logger=None):
      → send HEAD request
      → return {status_code, redirect_url, response_time, error}
    def get_request(url, timeout=15, logger=None):
      → send GET request
      → return {status_code, content, response_time, error}
    def batch_head_requests(urls, rate_limit, logger):
      → iterate urls, sleep between requests, collect results
    RATE_LIMIT: enforced via time.sleep between every request
    RETRY: max 1 retry on timeout, then TIMEOUT_ERROR
    GUARD: verify all requests are GET or HEAD only
           if any other method attempted: log SITE_WRITE_DETECTED + abort
  IMPORTS: requests, time, shared_config, shared_logger
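
  A sketch of the rate-limit / retry loop with the HTTP call injected as a
  parameter (an assumption, not in the spec) so DRY_RUN tests can pass a stub
  instead of touching the network; the real module would call
  requests.head / requests.get:

```python
import time

ALLOWED_METHODS = {"GET", "HEAD"}

def batch_head_requests(urls, rate_limit, request_fn, method="HEAD"):
    # GUARD: abort before anything runs if a write method slips through
    if method not in ALLOWED_METHODS:
        raise SystemExit(f"SITE_WRITE_DETECTED: {method}")
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(rate_limit)          # enforced between every request
        try:
            results.append(request_fn(url))
        except TimeoutError:
            try:
                results.append(request_fn(url))   # max 1 retry
            except TimeoutError:
                results.append({"url": url, "error": "TIMEOUT_ERROR"})
    return results
```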

---

MODULE 10: sgc1_nav_crossref.py
  PURPOSE: Cross-reference each URL against all 3 nav sources
  CONTAINS:
    def parse_nav_sources(live_capture_data, nav_reference):
      → build: nav_source_urls["homepage"] = set(urls)
      → build: nav_source_urls["128-nav"] = set(urls)
      → build: nav_source_urls["quicklinks"] = set(urls)
    def crossref_url(url, nav_source_urls):
      → return nav_sources[] list (which sources include this URL)
    def detect_orphans(results, nav_source_urls):
      → return urls that are LIVE but in no nav source
    def detect_nav_duplicates(nav_source_urls):
      → return urls appearing twice in same nav source
    def detect_cross_source_slug_mismatches(nav_source_urls, label_map):
      → same label, different URL slug across sources
      → return mismatch list with conflict objects
  IMPORTS: shared_config, shared_logger
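
  The cross-reference and orphan checks above reduce to set operations; a
  minimal sketch:

```python
def crossref_url(url, nav_source_urls):
    # which nav sources include this URL
    return [src for src, urls in nav_source_urls.items() if url in urls]

def detect_orphans(live_urls, nav_source_urls):
    # LIVE URLs that appear in no nav source at all
    in_any_nav = set().union(*nav_source_urls.values())
    return [u for u in live_urls if u not in in_any_nav]
```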

---

MODULE 11: sgc1_body_parser.py
  PURPOSE: Parse HTML body content for pages needing it
  CONTAINS:
    def needs_body_crawl(result_item, prior_captured_set):
      → True if: LIVE, not PLACEHOLDER/DEV, not already captured, not X_RATED (unless flag)
    def extract_body_data(html_content, url):
      → BeautifulSoup parse
      → return: page_title, h1_text, meta_description, internal_links[],
                 word_count, has_rating_badge, has_js_gate, visible_text_excerpt
    def detect_js_rendered(html_content):
      → word_count < 50 AND noscript present → return True
    def extract_internal_links(soup, base_url):
      → all <a href> pointing to collabtunes.com domain
      → normalize each, return list
  IMPORTS: bs4 (pip package: beautifulsoup4), lxml, shared_config
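
  The module above calls for BeautifulSoup; this is a dependency-free sketch
  of the same internal-link extraction using the stdlib html.parser, to show
  the normalize-and-filter step in isolation:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class InternalLinkExtractor(HTMLParser):
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)   # resolve relative hrefs
        # keep only links pointing at the collabtunes.com domain
        if urlparse(absolute).netloc.endswith("collabtunes.com"):
            self.links.append(absolute)

def extract_internal_links(html_content, base_url):
    parser = InternalLinkExtractor(base_url)
    parser.feed(html_content)
    return parser.links
```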

---

MODULE 12: sgc1_routing_classifier.py
  PURPOSE: Assign rating, gate requirement, and safe route to each item
  CONTAINS:
    RATING_PRIORITY = ["LOCKED_CANON", "RATINGS_INDEX", "URL_SELF_DECLARED",
                        "BODY_BADGE", "PENDING"]
    def assign_rating(url, label, canon_data, ratings_data):
      → check priority list, return first match
    def assign_gate(rating):
      → return gate_requirement string per rating
    def assign_safe_route(rating, has_gate_in_body):
      → return safe_route classification
    def detect_gate_missing(result_item):
      → rating is R+ AND has_js_gate is False → flag GATE_MISSING
      → cross-reference against known blockers from UNRESOLVED_BLOCKERS
  IMPORTS: shared_config, shared_logger

---

MODULE 13: sgc1_conflict_detector.py
  PURPOSE: Detect and record all conflicts
  CONTAINS:
    KNOWN_BLOCKERS = {
      "CC-LW": ["BLOCK-H04", "/20-35-the-lady-weaver/", "/36-35-lady-weaver/"],
      "CC-CH18": ["BLOCK-H02", "/18-of-35-business-plan/", "/18-of-35-project-summaries/"],
      "CC-YT-URL": ["BLOCK-M01", "/lyric-videos-...-video/", "/lyric-videos-...-youtube-video/"],
      ... (all from FINAL_CANON_AUTHORITY_REGISTRY open conflicts section)
    }
    def detect_status_mismatch(result_item, expected_status):
      → generate conflict object if mismatch
    def detect_chapter_drift(url, nav_label_number):
      → compare URL slug number vs nav label number
      → generate conflict if different
    def build_chapter_drift_map(results):
      → return: list of {url, url_number, nav_label_number, delta}
    def link_to_known_blocker(conflict):
      → scan KNOWN_BLOCKERS dict, attach blocker_id if matched
    def deduplicate_conflicts(conflicts):
      → remove exact-duplicate conflict objects
  IMPORTS: shared_config, shared_logger
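
  A sketch of chapter-drift detection: compare the leading number in the URL
  slug against the chapter number from the nav label (the example values echo
  the CC-LW blocker slugs listed above):

```python
import re

SLUG_NUM = re.compile(r'^/(\d+)-')

def detect_chapter_drift(url_slug, nav_label_number):
    m = SLUG_NUM.match(url_slug)
    if not m:
        return None  # slug carries no leading chapter number
    url_number = int(m.group(1))
    if url_number != nav_label_number:
        return {"url": url_slug, "url_number": url_number,
                "nav_label_number": nav_label_number,
                "delta": url_number - nav_label_number}
    return None
```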

---

MODULE 14: sgc1_deduplicator.py
  PURPOSE: Deduplicate results and designate canonical URLs
  CONTAINS:
    def find_url_duplicates(results):
      → group by normalized_slug
      → return groups with >1 member
    def designate_canonical(duplicate_group, url_registry):
      → check url_registry for explicit canonical designation
      → if designated: mark others as NON_CANONICAL
      → if not: mark all as CONFLICT
    def remove_result_duplicates(results):
      → exact same URL object appearing twice → remove second occurrence
      → mark removed as DUPLICATE_ENTRY in first occurrence
  IMPORTS: shared_config

---

MODULE 15: sgc1_exporter.py
  PURPOSE: Write all output files
  CONTAINS:
    def build_json_output(results, conflicts, flags, run_log, run_meta):
      → assemble full JSON per SPEC schema
    def build_txt_summary(results, conflicts, flags, run_meta):
      → human-readable formatted text per Flowmap Phase 7.2 spec
    def build_manifest(results, conflicts, flags, output_files, run_meta):
      → run manifest text per Flowmap Phase 7.3 spec
    def write_all(json_data, txt_data, manifest_data, run_id, output_dir, logger):
      → call shared_output_writer for each file
      → verify each write succeeded
  IMPORTS: shared_output_writer, shared_config, shared_logger, json

---

MODULE 16: sgc1_main.py (ORCHESTRATOR)
  PURPOSE: Orchestrate all SGC-1 phases in order
  CONTAINS:
    def main():
      args = sgc1_args.parse_args()
      logger = RunLogger(run_id, args.mode, args.output_dir)
      IF args.resume: checkpoint = shared_checkpoint.read_checkpoint(args.resume)
      PHASE 0: pre-flight (validate paths, load seeds, load nav, write start checkpoint)
      PHASE 1: URL normalization + crawl queue build
      IF DRY_RUN: log phases that would run, exit
      PHASE 2: HTTP HEAD requests (with confirmation gate for LIVE_RUN)
      checkpoint_phase(2)
      PHASE 3: Nav cross-reference
      PHASE 4: Body gathering (selective)
      checkpoint_phase(4)
      PHASE 5: Routing classification
      PHASE 6: Conflict detection
      PHASE 7: Dedupe
      PHASE 8: Export
      PHASE 8R: Rollback safety verification (mtime checks, no-write confirmation)
      logger.write_log_to_file()
  IMPORTS: all SGC-1 modules, all shared modules

================================================================================
SECTION 4 — SGC-2 MODULES (Stage 3)
================================================================================

MODULE 17: sgc2_args.py
  PURPOSE: CLI argument parsing for SGC-2
  CONTAINS:
    def parse_args() → argparse.Namespace with:
      mode, root, output_dir, max_zip_depth, skip_sensitive, resume
  IMPORTS: argparse, shared_config

---

MODULE 18: sgc2_dir_walker.py
  PURPOSE: Recursive directory scan
  CONTAINS:
    def walk_directory(root_path):
      → os.walk() entire tree
      → collect: filepath, filename, extension, size_bytes, mtime, parent_folder
      → return raw_file_list[], folder_list[]
    def record_mtime_baseline(raw_file_list):
      → {filepath: mtime} for all files
      → used for integrity verification at end of run
    def verify_mtimes_unchanged(baseline, current):
      → compare current os.stat(f).st_mtime against baseline
      → return list of changed files (should be empty)
  IMPORTS: os, os.path, shared_config, shared_logger
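
  The mtime baseline and verification step reduce to a dict snapshot plus a
  re-stat pass; a minimal sketch:

```python
import os

def record_mtime_baseline(raw_file_list):
    # snapshot {filepath: mtime} for every file seen during the walk
    return {item["filepath"]: item["mtime"] for item in raw_file_list}

def verify_mtimes_unchanged(baseline):
    # re-stat each file; any drift means something wrote to the tree
    changed = []
    for filepath, old_mtime in baseline.items():
        if os.stat(filepath).st_mtime != old_mtime:
            changed.append(filepath)
    return changed
```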

---

MODULE 19: sgc2_sensitive_guard.py
  PURPOSE: Detect and isolate sensitive files BEFORE any reading logic runs
  CONTAINS:
    def is_sensitive(filename):
      → check against SENSITIVE_FILENAME_PATTERNS from shared_config
      → return bool
    def partition_file_list(raw_file_list):
      → return (safe_files[], sensitive_files[])
      → sensitive files: catalog path + existence only
    def verify_sensitive_not_opened(sensitive_files, opened_files_log):
      → confirm no sensitive filepath appears in opened_files_log
      → if found: log SENSITIVE_OPENED + abort
  IMPORTS: shared_config, shared_logger
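
  A sketch of the partition step, with the sensitive patterns inlined here
  rather than imported from shared_config:

```python
SENSITIVE_FILENAME_PATTERNS = ["DEFAMATION_RISK_REGISTRY",
                               "CREATOR_INTERVIEW_TRANSCRIPT",
                               "MASTER_DUMPS"]

def is_sensitive(filename):
    upper = filename.upper()
    return any(p in upper for p in SENSITIVE_FILENAME_PATTERNS)

def partition_file_list(raw_file_list):
    # sensitive files are set aside before any reading logic can touch them
    safe, sensitive = [], []
    for item in raw_file_list:
        (sensitive if is_sensitive(item["filename"]) else safe).append(item)
    return safe, sensitive
```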

---

MODULE 20: sgc2_filename_parser.py
  PURPOSE: Parse filename metadata for all safe files
  CONTAINS:
    def parse_all(safe_file_list):
      → call shared_naming_validator.parse_filename() for each file
      → enrich each item with: category[], date_token, vol_number, authority_tag, etc.
    def check_compliance(safe_file_list):
      → call shared_naming_validator.validate_filename() for each file
      → return violations[] list
    def infer_authority(file_item):
      → call shared_naming_validator.infer_authority_from_filename()
  IMPORTS: shared_naming_validator, shared_config

---

MODULE 21: sgc2_zip_inspector.py
  PURPOSE: Open ZIPs read-only, list members, extract manifests
  CONTAINS:
    def inspect_zip(filepath, max_depth, current_depth=0, logger=None):
      → open with zipfile.ZipFile(filepath, 'r')
      → list members: name, size, date_time
      → if member matches MANIFEST/README pattern: extract text only
      → if member is a .zip AND current_depth < max_depth: recurse
      → if current_depth >= max_depth: flag MAX_DEPTH_REACHED
      → return zip_data{member_count, members[], manifest_text, nested_zips[]}
    def parse_manifest_text(manifest_text):
      → extract: ZIP_NAME, PHASE, CATEGORY, AUTHORITATIVE, CONTAINS, etc.
      → return manifest_data dict
    def crossref_manifest_vs_members(manifest_data, members):
      → compare manifest CONTAINS list vs actual members
      → return mismatches[] → MANIFEST_MISMATCH flags
    def detect_missing_manifests(zip_list):
      → return ZIPs where no manifest member was found → MISSING_MANIFEST flags
  IMPORTS: zipfile, shared_config, shared_logger
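
  A sketch of the read-only inspection loop; manifest detection is simplified
  to a name check, and nested ZIPs are read into memory for recursion (an
  assumption about how recursion is implemented):

```python
import io
import zipfile

def inspect_zip(source, max_depth, current_depth=0):
    data = {"member_count": 0, "members": [], "manifest_text": None,
            "nested_zips": [], "flags": []}
    with zipfile.ZipFile(source, "r") as zf:   # 'r' only — never write mode
        infos = zf.infolist()
        data["member_count"] = len(infos)
        for info in infos:
            data["members"].append({"name": info.filename,
                                    "size": info.file_size})
            upper = info.filename.upper()
            if "MANIFEST" in upper or "README" in upper:
                data["manifest_text"] = zf.read(info).decode("utf-8", "replace")
            elif info.filename.lower().endswith(".zip"):
                if current_depth < max_depth:
                    nested = io.BytesIO(zf.read(info))
                    data["nested_zips"].append(
                        inspect_zip(nested, max_depth, current_depth + 1))
                else:
                    data["flags"].append("MAX_DEPTH_REACHED")
    return data
```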

---

MODULE 22: sgc2_txt_scanner.py
  PURPOSE: Header scan of TXT files for status/authority/supersedes
  CONTAINS:
    opened_files_log = []  # track every file opened for integrity verification
    def scan_txt(filepath, logger):
      opened_files_log.append(filepath)
      → read first TXT_HEADER_SCAN_CHARS characters
      → if size < TXT_FULL_READ_MAX_BYTES: read full file
      → extract: doc_title, status_line, generated_line, purpose_line,
                   supersedes_refs, prior_version_refs
      → return txt_metadata dict
    def extract_authority_from_status(status_line):
      → match known status markers → return authority level
    def build_supersedes_graph(txt_metadata_list):
      → for each file with supersedes_refs: link to referenced files
      → build bidirectional: supersedes[] and superseded_by
  IMPORTS: shared_config, shared_logger
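
  A sketch of the header scan, operating on already-read text to keep it
  self-contained; the status and supersedes markers matched here are
  illustrative, not the full marker set:

```python
TXT_HEADER_SCAN_CHARS = 2000

def scan_txt_text(text):
    header = text[:TXT_HEADER_SCAN_CHARS]
    meta = {"doc_title": None, "status_line": None, "supersedes_refs": []}
    for line in header.splitlines():
        stripped = line.strip()
        # first non-banner line is taken as the document title
        if meta["doc_title"] is None and stripped and not stripped.startswith("="):
            meta["doc_title"] = stripped
        if stripped.upper().startswith("STATUS"):
            meta["status_line"] = stripped
        if "SUPERSEDES" in stripped.upper():
            meta["supersedes_refs"].append(stripped)
    return meta
```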

---

MODULE 23: sgc2_folder_validator.py
  PURPOSE: Compare actual folder structure vs expected
  CONTAINS:
    EXPECTED_FOLDERS = [
      "00_OPERATIONAL_RULES", "01_LIVE_CAPTURE", "02_CANON", "03_RATINGS",
      "04_BLOCKERS", "05_URL_MAPS", "06_NAV_STABILIZATION", "07_DEPLOYMENT",
      "08_PLACEHOLDERS", "09_HTML_PROTOTYPES", "10_QA", "11_REGISTRY",
      "12_MANIFESTS", "13_SOURCE_ZIPS", "14_GENERATED_OUTPUT"
    ]
    def validate_structure(folder_list, root_path):
      → compare folder_list against EXPECTED_FOLDERS
      → return: {present[], missing[], unexpected[]}
    def validate_file_placement(file_item, expected_folders):
      → check if file category[] matches its parent folder
      → return placement_status: CORRECT | MISPLACED | UNEXPECTED_FOLDER
  IMPORTS: shared_config, shared_logger
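
  The structure check above is three set differences; a minimal sketch (the
  folder list is truncated here — the full 15-folder list is in the module
  spec above):

```python
EXPECTED_FOLDERS = ["00_OPERATIONAL_RULES", "01_LIVE_CAPTURE", "02_CANON"]

def validate_structure(folder_list, expected=EXPECTED_FOLDERS):
    actual = set(folder_list)
    expected_set = set(expected)
    return {"present": sorted(actual & expected_set),
            "missing": sorted(expected_set - actual),
            "unexpected": sorted(actual - expected_set)}
```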

---

MODULE 24: sgc2_dependency_mapper.py
  PURPOSE: Map dependency chains and generator input readiness
  CONTAINS:
    DEPENDENCY_CHAINS = {
      "CHAIN_A_RATING_GATE": ["FRONT_DOOR_BOUNCER", "RATINGS_INDEX", "mood_settings"],
      "CHAIN_B_AIO_GENERATOR": ["mood_settings", "GX_data", "SL1_tracks", "template"],
      "CHAIN_C_NAV_INTEGRITY": ["chapter_drift_decision", "ch18_decision", "LadyWeaver"],
      "CHAIN_D_CROSSLINKS": ["URL_conflicts_resolved", "URL_AUTHORITY_REGISTRY"],
      "CHAIN_E_DEFAMATION": ["DEFAMATION_RISK_REGISTRY"]  # catalog only
    }
    GENERATOR_REQUIRED_INPUTS = [
      "mood_settings_ratings_explicit_for_all_34_albums",
      "MASTER_CONTENT_RATINGS_INDEX_VOL3",
      "FINAL_CANON_AUTHORITY_REGISTRY",
      "MASTER_URL_AUTHORITY_REGISTRY",
      "HTML_TESTER_NUMBER_TWO_FIXED_COLOR"
    ]
    def assess_chain(chain_name, file_inventory):
      → check if all required files in chain are present with AUTHORITATIVE status
      → return: COMPLETE | MISSING (list which) | BLOCKED
    def assess_generator_inputs(file_inventory):
      → check each GENERATOR_REQUIRED_INPUTS against inventory
      → flag CRITICAL_MISSING if mood_settings not found
      → return generator_inputs[] with status per input
  IMPORTS: shared_config, shared_logger
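
  A sketch of assess_chain(); the inventory shape ({base_name: authority
  level}) is an assumption about what earlier phases hand this module:

```python
def assess_chain(chain_name, required, file_inventory):
    # file_inventory: {base_name: authority_level}
    missing = [r for r in required if r not in file_inventory]
    if missing:
        return {"chain": chain_name, "status": "MISSING", "missing": missing}
    # present but not yet AUTHORITATIVE/LOCKED blocks the chain
    not_auth = [r for r in required
                if file_inventory[r] not in ("AUTHORITATIVE", "LOCKED")]
    if not_auth:
        return {"chain": chain_name, "status": "BLOCKED", "blocked_on": not_auth}
    return {"chain": chain_name, "status": "COMPLETE"}
```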

---

MODULE 25: sgc2_deduplicator.py
  PURPOSE: Group files by base_name, identify version chains
  CONTAINS:
    def group_by_base_name(file_list):
      → strip date and VOL from filename to get base_name
      → group files with same base_name
      → return groups dict: {base_name: [file_items]}
    def tag_authority_within_group(group):
      → most recent date OR highest VOL → AUTHORITATIVE
      → others → REFERENCE or DEPRECATED (per existing authority tag)
      → if two same date, different content → AMBIGUOUS_VERSION flag
    def build_version_chains(groups):
      → return version_chains[]: [{base_name, versions[], current}]
    def find_orphan_files(file_list):
      → files with no manifest, not part of any group, not SENSITIVE
      → return orphan_files[]
  IMPORTS: shared_config, shared_naming_validator
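
  A sketch of base-name grouping: strip the date and VOL tokens from the
  filename stem, then bucket. The combined strip regex is an assumption built
  from the patterns in shared_naming_validator:

```python
import re

# date token (_M_DD_YY) or volume token (VOL<N>, optionally underscore-prefixed)
DATE_OR_VOL = re.compile(r'(_\d{1,2}_\d{2}_\d{2}|_?VOL\d+)')

def base_name_of(filename):
    stem = filename.rsplit(".", 1)[0]
    return DATE_OR_VOL.sub("", stem)

def group_by_base_name(file_list):
    groups = {}
    for item in file_list:
        groups.setdefault(base_name_of(item["filename"]), []).append(item)
    return groups
```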

---

MODULE 26: sgc2_exporter.py
  PURPOSE: Write all SGC-2 output files
  CONTAINS:
    def build_json_output(files, folders, zips, chains, dep_chains, gen_inputs,
                          conflicts, flags, sensitive_files, run_log, run_meta):
      → assemble full JSON per SPEC schema
    def build_txt_summary(...):
      → human-readable formatted text per Flowmap Phase 8.2 spec
    def build_manifest(counts, output_files, run_meta, top_flags):
      → run manifest per Flowmap Phase 8.3 spec
    def write_all(json_data, txt_data, manifest_data, run_id, output_dir, logger):
      → call shared_output_writer for each file
  IMPORTS: shared_output_writer, shared_config, shared_logger, json

---

MODULE 27: sgc2_main.py (ORCHESTRATOR)
  PURPOSE: Orchestrate all SGC-2 phases in order
  CONTAINS:
    def main():
      args = sgc2_args.parse_args()
      logger = RunLogger(run_id, args.mode, args.output_dir)
      PHASE 0: pre-flight (validate paths, write start checkpoint)
      PHASE 1: dir_walker.walk_directory() + mtime baseline
      PHASE 0.5: sensitive_guard.partition_file_list() (runs immediately after the walk; BEFORE any later phase reads file contents)
      checkpoint_phase(1)
      PHASE 2: filename_parser.parse_all() + check_compliance()
      PHASE 3: zip_inspector.inspect_zip() for each ZIP
      checkpoint_phase(3)
      PHASE 4: txt_scanner.scan_txt() for each TXT (safe only)
      PHASE 5: folder_validator.validate_structure() + validate_file_placement()
      PHASE 6: dependency_mapper.assess_chain() + assess_generator_inputs()
      checkpoint_phase(6)
      PHASE 7: deduplicator.group_by_base_name() + tag_authority() + build_chains()
      PHASE 8: sgc2_exporter.write_all()
      PHASE 9: verify mtimes unchanged + verify no sensitive files opened
      logger.write_log_to_file()
  IMPORTS: all SGC-2 modules, all shared modules

================================================================================
SECTION 5 — MODULE DEPENDENCY GRAPH
================================================================================

shared_config (no imports)
shared_logger ← shared_config
shared_output_writer ← shared_config, shared_logger
shared_checkpoint ← shared_config, shared_output_writer, shared_logger
shared_naming_validator ← shared_config

sgc1_args ← shared_config
sgc1_seed_loader ← shared_config, shared_logger
sgc1_url_normalizer ← shared_config
sgc1_http_requester ← shared_config, shared_logger
sgc1_nav_crossref ← shared_config, shared_logger
sgc1_body_parser ← shared_config
sgc1_routing_classifier ← shared_config, shared_logger
sgc1_conflict_detector ← shared_config, shared_logger
sgc1_deduplicator ← shared_config
sgc1_exporter ← shared_output_writer, shared_config, shared_logger
sgc1_main ← ALL sgc1 modules + ALL shared modules

sgc2_args ← shared_config
sgc2_dir_walker ← shared_config, shared_logger
sgc2_sensitive_guard ← shared_config, shared_logger
sgc2_filename_parser ← shared_naming_validator, shared_config
sgc2_zip_inspector ← shared_config, shared_logger
sgc2_txt_scanner ← shared_config, shared_logger
sgc2_folder_validator ← shared_config, shared_logger
sgc2_dependency_mapper ← shared_config, shared_logger
sgc2_deduplicator ← shared_config, shared_naming_validator
sgc2_exporter ← shared_output_writer, shared_config, shared_logger
sgc2_main ← ALL sgc2 modules + ALL shared modules

run_both.py ← sgc1_main, sgc2_main

NO CIRCULAR DEPENDENCIES — verify before coding.

================================================================================
SECTION 6 — FILE STRUCTURE FOR THE CODE ITSELF
================================================================================

COLLABTUNES_PROJECT_ROOT/
└── 15_GATHERING_TOOLS/
    ├── shared/
    │   ├── shared_config.py
    │   ├── shared_logger.py
    │   ├── shared_output_writer.py
    │   ├── shared_checkpoint.py
    │   └── shared_naming_validator.py
    ├── sgc1/
    │   ├── sgc1_args.py
    │   ├── sgc1_seed_loader.py
    │   ├── sgc1_url_normalizer.py
    │   ├── sgc1_http_requester.py
    │   ├── sgc1_nav_crossref.py
    │   ├── sgc1_body_parser.py
    │   ├── sgc1_routing_classifier.py
    │   ├── sgc1_conflict_detector.py
    │   ├── sgc1_deduplicator.py
    │   ├── sgc1_exporter.py
    │   └── sgc1_main.py
    ├── sgc2/
    │   ├── sgc2_args.py
    │   ├── sgc2_dir_walker.py
    │   ├── sgc2_sensitive_guard.py
    │   ├── sgc2_filename_parser.py
    │   ├── sgc2_zip_inspector.py
    │   ├── sgc2_txt_scanner.py
    │   ├── sgc2_folder_validator.py
    │   ├── sgc2_dependency_mapper.py
    │   ├── sgc2_deduplicator.py
    │   ├── sgc2_exporter.py
    │   └── sgc2_main.py
    ├── run_both.py
    ├── outputs/
    │   └── [all SGC output files land here]
    └── logs/
        └── [all checkpoint and log files land here]

================================================================================
SECTION 7 — WHAT MIXED CLAUDE NEEDS TO DO IN HEAVY CODING PHASE
================================================================================

MIXED CLAUDE'S TASK (after reading all 6 blueprint files):

1. BUILD Stage 1 (5 shared modules) — no network, no file I/O yet
   Test: can import all shared modules without error

2. BUILD sgc1_args.py + sgc1_seed_loader.py
   Test: loads MASTER_URL_AUTHORITY_REGISTRY correctly, parses all URLs

3. BUILD sgc1_url_normalizer.py
   Test: all 121+ URLs normalize correctly, page types classified correctly

4. BUILD sgc1_http_requester.py
   Test in DRY_RUN: logs what requests it would make — no actual network calls
   Test in LIVE_RUN (small batch): HEAD requests to 5 known URLs

5. BUILD sgc1_nav_crossref.py
   Test: cross-reference produces correct nav_sources[] for known URLs

6. BUILD sgc1_body_parser.py
   Test: parse a known collabtunes.com page HTML fixture

7. BUILD sgc1_routing_classifier.py + sgc1_conflict_detector.py
   Test: known conflicts from FINAL_CANON_AUTHORITY_REGISTRY are detected

8. BUILD sgc1_deduplicator.py + sgc1_exporter.py
   Test: output files produced with correct schema and valid JSON

9. ASSEMBLE sgc1_main.py — full SGC-1 dry run test

10. BUILD Stage 3 (SGC-2 modules) in same sequence

11. ASSEMBLE sgc2_main.py — full SGC-2 dry run test

12. BUILD run_both.py — integration test

DELIVERY: Both codes run successfully in DRY_RUN mode.
          Both produce valid JSON + TXT + Manifest outputs.
          All safety guards verified (no writes to site, no sensitive files opened).

================================================================================
END CODE_BUILD_ORDER_AND_MODULE_PLAN_5_12_26.txt
================================================================================
