cru process
Process markdown files in your kiln to enable search and AI features.
Synopsis
cru process [OPTIONS]
Description
The process command parses all markdown files in your kiln and stores structured data in the local database. This enables semantic search, knowledge graph queries, and AI agent integration.
What processing does:
- Parses markdown files for structure
- Extracts frontmatter metadata
- Identifies wikilinks and builds the graph
- Extracts tags (including nested tags)
- Splits content into searchable blocks
- Generates embeddings for semantic search
- Stores everything in the local database (SQLite by default)
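To make a few of these steps concrete, here is a minimal Python sketch of frontmatter, wikilink, tag, and block extraction. It is not Crucible's parser (which is written in Rust); the regular expressions, the `extract_metadata` name, and the blank-line block splitting are all assumptions for illustration, and the frontmatter is returned as raw text rather than parsed YAML so the example stays in the standard library.

```python
import re

# Hypothetical patterns for illustration only; Crucible's real parser may differ.
FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")
TAG_RE = re.compile(r"(?<![\w/#])#([A-Za-z][\w-]*(?:/[\w-]+)*)")


def extract_metadata(text: str) -> dict:
    """Split a note into raw frontmatter, wikilink targets, tags, and blocks."""
    match = FRONTMATTER_RE.match(text)
    frontmatter = match.group(1) if match else ""
    body = text[match.end():] if match else text
    # Keep only the link target, dropping any #heading or |alias part.
    links = [target.strip() for target in WIKILINK_RE.findall(body)]
    # Nested tags such as #project/alpha are kept as a single tag string.
    tags = TAG_RE.findall(body)
    # Assumption: blank-line-separated paragraphs stand in for searchable blocks.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", body) if b.strip()]
    return {"frontmatter": frontmatter, "links": links, "tags": tags, "blocks": blocks}
```

Calling `extract_metadata` on a note's text returns a dictionary that a later stage could embed and store.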
Options
--force
Reprocess all files regardless of whether they’ve changed.
cru process --force
By default, processing is incremental - only files with changed content are reprocessed.
--watch
Keep watching for file changes and reprocess automatically.
cru process --watch
Use Ctrl+C to stop watching.
--dry-run
Preview what would be processed without making changes.
cru process --dry-run
--parallel <N>
Set the number of parallel workers.
cru process --parallel 4
Default: CPU cores / 2
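As a rough illustration of that default, the Python sketch below halves the CPU count reported by the operating system. Whether Crucible counts logical or physical cores, how it rounds, and whether it enforces a minimum of one worker are not stated here, so those details (and the `default_workers` name) are assumptions.

```python
import os


def default_workers() -> int:
    """Half the logical CPU count, with an assumed floor of one worker."""
    cores = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, cores // 2)
```

On an 8-core machine this sketch would pick 4 workers, which matches the `--parallel 4` example above.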
Incremental Processing
By default, Crucible uses content hashing to detect changes:
- Calculate hash of file content
- Compare with stored hash
- Only reprocess if different
This makes subsequent runs fast - only changed files are processed.
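The three steps above can be sketched in Python as follows. This is only an illustration of hash-based change detection: the choice of SHA-256, the in-memory `stored` dictionary standing in for the database, and the function names are assumptions, not Crucible's actual implementation.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hex digest of the file content (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


def needs_reprocessing(path: str, data: bytes, stored: dict, force: bool = False) -> bool:
    """Compare the fresh hash with the stored one and record any change."""
    digest = content_hash(data)
    if not force and stored.get(path) == digest:
        return False  # unchanged: skip this file
    stored[path] = digest  # new, modified, or forced: remember the hash
    return True
```

Because the decision depends on content rather than modification time, touching a file without editing it would not trigger reprocessing in this sketch.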
Force full reprocessing with:
cru process --force
Processing Pipeline
Files go through these stages:
- Discovery - Find all .md files in kiln
- Filtering - Skip ignored directories (.crucible, .git, etc.)
- Hashing - Check for content changes
- Parsing - Extract structure from markdown
- Enrichment - Generate embeddings
- Storage - Write to database
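The later stages can be sketched as a small Python pipeline. This is a toy model under stated assumptions: Discovery and Filtering are taken as already done (the input is a dictionary of path to text), `parse` and `enrich` are hypothetical placeholders (the "embedding" is just a word count, not a real vector), and plain dictionaries stand in for the stored hashes and the database.

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib


def parse(text: str) -> dict:
    """Parsing stage placeholder: record the first line as title plus a word count."""
    first_line = text.splitlines()[0] if text else ""
    return {"title": first_line.lstrip("# ").strip(), "words": len(text.split())}


def enrich(parsed: dict) -> dict:
    """Enrichment stage placeholder: a real pipeline would call an embedding model."""
    return {**parsed, "embedding": [float(parsed["words"])]}


def run_pipeline(files: dict, hashes: dict, db: dict, workers: int = 4) -> tuple:
    """Hash, parse, enrich, and store files; return (processed, skipped) counts."""
    changed = {}
    for path, text in files.items():  # Hashing stage
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if hashes.get(path) != digest:
            changed[path] = (text, digest)

    def process(item):
        path, (text, digest) = item
        return path, digest, enrich(parse(text))

    # Parsing and enrichment run on a pool of workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for path, digest, record in pool.map(process, changed.items()):
            db[path] = record  # Storage stage
            hashes[path] = digest
    return len(changed), len(files) - len(changed)
```

The returned pair corresponds to the processed and skipped counts that a run reports when it finishes.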
Example Output
Initializing storage...
✓ Storage initialized
Creating processing pipeline...
✓ Pipeline ready

Processing 38 files through pipeline (with 4 workers)...
[========================================] 38/38 Processing: My Note.md

Pipeline processing complete!
Processed: 38 files
Skipped (unchanged): 0 files
Database Location
Processed data is stored at:
<kiln_path>/.crucible/crucible-sqlite.db
This is derived data - your markdown files remain the source of truth.
Ignored Patterns
These directories are automatically skipped:
- .crucible/ (database)
- .git/
- .obsidian/
- node_modules/
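A minimal Python sketch of discovery with these directories pruned might look like the following. The directory names come from the list above; matching them at any depth of the kiln, and the `discover_markdown` name, are assumptions rather than documented behavior.

```python
import os

# Directory names taken from the list above.
IGNORED_DIRS = {".crucible", ".git", ".obsidian", "node_modules"}


def discover_markdown(kiln_path: str) -> list:
    """Return sorted relative paths of .md files, pruning ignored directories."""
    found = []
    for root, dirs, files in os.walk(kiln_path):
        # Editing dirs in place stops os.walk from descending into them.
        dirs[:] = [d for d in dirs if d not in IGNORED_DIRS]
        for name in files:
            if name.endswith(".md"):
                found.append(os.path.relpath(os.path.join(root, name), kiln_path))
    return sorted(found)
```

Pruning during the walk, rather than filtering afterwards, avoids traversing large trees such as node_modules/ at all.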
Error Handling
Processing continues if individual files fail. Errors are logged but don’t stop the pipeline.
Common issues:
- Invalid frontmatter: YAML parsing errors are logged
- Encoding issues: Non-UTF8 files are skipped
- Permission denied: Inaccessible files are skipped
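The "log and continue" behavior can be illustrated with a short Python sketch. It covers only the encoding and access cases from the list above; the `read_notes` name, the logger name, and the log messages are hypothetical and do not reflect Crucible's actual log format.

```python
import logging

logger = logging.getLogger("process_sketch")  # hypothetical logger name


def read_notes(paths: list) -> tuple:
    """Read each file as UTF-8; log and skip failures instead of aborting."""
    loaded, skipped = {}, []
    for path in paths:
        try:
            with open(path, encoding="utf-8") as handle:
                loaded[path] = handle.read()
        except UnicodeDecodeError as err:
            logger.warning("skipping non-UTF-8 file %s: %s", path, err)
            skipped.append(path)
        except OSError as err:  # includes PermissionError and FileNotFoundError
            logger.warning("skipping unreadable file %s: %s", path, err)
            skipped.append(path)
    return loaded, skipped
```

Catching errors per file keeps one bad note from blocking the rest of the kiln.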
Watch Mode
With --watch, Crucible monitors your kiln for changes:
cru process --watch
- Uses filesystem events for efficiency
- Debounces rapid changes
- Debounces rapid changes
- Ctrl+C to exit
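Debouncing here means waiting until a file has been quiet for a short window before reprocessing it, so that a burst of saves triggers one run instead of many. The Python sketch below shows one way to do that; the `Debouncer` class, its method names, and the 0.5-second window are assumptions, not Crucible's implementation or its actual timing.

```python
class Debouncer:
    """Collect change events and release a path once it has been quiet long enough."""

    def __init__(self, quiet_seconds: float = 0.5):
        self.quiet_seconds = quiet_seconds  # assumed window, not Crucible's value
        self._last_seen = {}

    def record(self, path: str, now: float) -> None:
        """Note a filesystem event for `path` at time `now`."""
        self._last_seen[path] = now

    def ready(self, now: float) -> list:
        """Return and forget paths with no events for at least `quiet_seconds`."""
        due = sorted(
            p for p, t in self._last_seen.items() if now - t >= self.quiet_seconds
        )
        for path in due:
            del self._last_seen[path]
        return due
```

In a real watcher, `now` would typically come from `time.monotonic()`, with `record` called from the filesystem event handler and `ready` polled on a timer.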
Performance Tips
For large kilns (>1000 files):
- Use incremental processing (default)
- Adjust parallelism: --parallel 8
- Use --dry-run to preview scope
Implementation
Source code: crates/crucible-cli/src/commands/process.rs
Related modules:
- crates/crucible-sqlite/ - SQLite storage layer (default)
- crates/crucible-core/src/parser/ - Markdown parsing
- crates/crucible-llm/src/embeddings/ - Embedding generation
See Also
- :h stats - View kiln statistics
- :h search - Search processed content
- :h config.embedding - Embedding configuration
- Getting Started - Initial setup guide