# Berry Repository Structure Map
## Overview
Berry is an ultra-lightweight, dynamically typed embedded scripting language designed for lower-performance embedded devices. The interpreter core compiles to less than 40 KiB of code and can run with less than 4 KiB of heap.
## Directory Structure
### `/src/` - Core Source Code (152 files)
**Main Components:**
- **Virtual Machine**: `be_vm.c` (1419 lines) - Register-based VM execution
- **Parser**: `be_parser.c` (1841 lines) - One-pass compiler and syntax analysis
- **Lexer**: `be_lexer.c` (914 lines) - Tokenization and lexical analysis
- **API**: `be_api.c` (1179 lines) - External C API interface
- **Code Generation**: `be_code.c` (983 lines) - Bytecode generation
- **Garbage Collector**: `be_gc.c` (613 lines) - Mark-sweep garbage collection
**Data Types & Objects:**
- **Strings**: `be_string.c` (326 lines), `be_strlib.c` (1137 lines)
- **Lists**: `be_list.c` (207 lines), `be_listlib.c` (556 lines)
- **Maps**: `be_map.c` (354 lines), `be_maplib.c` (265 lines)
- **Classes**: `be_class.c` (374 lines)
- **Functions**: `be_func.c` (183 lines)
- **Bytes**: `be_byteslib.c` (1992 lines) - Binary data handling
**Built-in Libraries:**
- **JSON**: `be_jsonlib.c` (645 lines) - JSON parsing/generation
- **Math**: `be_mathlib.c` (438 lines) - Mathematical functions
- **OS**: `be_oslib.c` (271 lines) - Operating system interface
- **File**: `be_filelib.c` (265 lines) - File I/O operations
- **Debug**: `be_debug.c` (418 lines), `be_debuglib.c` (289 lines)
- **Introspection**: `be_introspectlib.c` (298 lines)
- **Time**: `be_timelib.c` (72 lines)
**Memory & Execution:**
- **Memory Management**: `be_mem.c` (377 lines)
- **Execution**: `be_exec.c` (531 lines)
- **Bytecode**: `be_bytecode.c` (634 lines)
- **Variables**: `be_var.c` (201 lines)
- **Modules**: `be_module.c` (509 lines)
**Headers:**
- **Main Header**: `berry.h` (2395 lines) - Primary API definitions
- **Constants**: `be_constobj.h` (505 lines) - Constant object definitions
### `/tests/` - Unit Tests (54 files)
**Core Language Tests:**
- `assignment.be`, `bool.be`, `class.be`, `closure.be`, `function.be`
- `for.be`, `vararg.be`, `cond_expr.be`, `exceptions.be`
**Data Type Tests:**
- `list.be`, `map.be`, `range.be`, `string.be`, `int.be`, `bytes.be`
**Library Tests:**
- `json.be` (9168 lines) - **Comprehensive JSON security tests**
- `math.be`, `os.be`, `debug.be`, `introspect.be`
**Parser & Compiler Tests:**
- `parser.be`, `lexer.be`, `compiler.be`, `suffix.be`
**Advanced Feature Tests:**
- `virtual_methods.be`, `super_auto.be`, `class_static.be`
- `division_by_zero.be`, `reference.be`, `compound.be`
### `/examples/` - Example Programs (16 files)
- `fib_rec.be` - Fibonacci recursion
- `qsort.be` - Quick sort implementation
- `bintree.be` - Binary tree operations
- `json.be` - JSON usage examples
- `repl.be` - REPL implementation
### `/default/` - Default Configuration (17 files)
- `berry_conf.h` - Configuration settings
- `be_modtab.c` - Module table definitions
- `be_port.c` - Platform-specific code
- `berry.c` - Main executable entry point
### `/generate/` - Generated Files (31 files)
- `be_const_strtab.h` - String table constants
- `be_fixed_*.h` - Fixed/compiled module definitions
- Auto-generated constant definitions
### `/tools/` - Development Tools
**Code Generation:**
- `/coc/` - Compile-on-command tools (13 files)
- Python scripts for code generation and optimization
**Editor Support:**
- `/plugins/vscode/` - Visual Studio Code plugin
- `/plugins/Notepad++/` - Notepad++ syntax highlighting
**Grammar:**
- `berry.ebnf` - EBNF grammar definition
- `berry.bytecode` - Bytecode format specification
## Key Architecture Components
### 1. **Virtual Machine** (`be_vm.c`)
- Register-based VM (not stack-based)
- Optimized for low memory usage
- Handles instruction execution and control flow
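
To make "register-based" concrete, here is a minimal sketch of a toy register machine. It is not Berry's instruction set or dispatch loop: the opcodes (`OP_LOADK`, `OP_ADD`, `OP_MUL`, `OP_RET`), the `toy_ins` layout, and the function names are all hypothetical. The point it illustrates is that each instruction names its operand registers directly instead of pushing and popping an operand stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy opcodes; NOT Berry's real instruction set. */
enum { OP_LOADK, OP_ADD, OP_MUL, OP_RET };

/* One instruction: an opcode plus three operands. Operands are register
 * indices, except that OP_LOADK uses b as an immediate constant. */
typedef struct {
    int op;
    int a, b, c;
} toy_ins;

#define TOY_NREGS 8

/* Run a program and return the register named by the first OP_RET reached. */
int toy_run(const toy_ins *code, size_t n)
{
    int reg[TOY_NREGS] = {0};
    size_t pc = 0;
    while (pc < n) {
        const toy_ins *i = &code[pc++];
        assert(i->a >= 0 && i->a < TOY_NREGS);
        switch (i->op) {
        case OP_LOADK:                      /* reg[a] = constant b */
            reg[i->a] = i->b;
            break;
        case OP_ADD:                        /* reg[a] = reg[b] op reg[c] */
        case OP_MUL:
            assert(i->b >= 0 && i->b < TOY_NREGS);
            assert(i->c >= 0 && i->c < TOY_NREGS);
            reg[i->a] = (i->op == OP_ADD) ? reg[i->b] + reg[i->c]
                                          : reg[i->b] * reg[i->c];
            break;
        case OP_RET:
            return reg[i->a];
        default:
            assert(!"unknown opcode");
        }
    }
    return 0; /* fell off the end without OP_RET */
}

/* Evaluates (2 + 3) * 4 using two registers. */
int toy_demo(void)
{
    const toy_ins prog[] = {
        { OP_LOADK, 0, 2, 0 },
        { OP_LOADK, 1, 3, 0 },
        { OP_ADD,   0, 0, 1 },
        { OP_LOADK, 1, 4, 0 },
        { OP_MUL,   0, 0, 1 },
        { OP_RET,   0, 0, 0 },
    };
    return toy_run(prog, sizeof prog / sizeof prog[0]);
}
```

A stack machine would need separate push and pop steps around each arithmetic operation; here the add and multiply each take a single instruction.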
### 2. **Parser & Lexer** (`be_parser.c`, `be_lexer.c`)
- One-pass compilation
- Generates bytecode directly
- Error handling and recovery
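
The idea of one-pass compilation can be sketched as a recursive-descent parser that emits instructions while it parses, without building a syntax tree first. The example below is an illustration under stated assumptions, not Berry's parser: it handles only unsigned integers, `+`, and `*` with no whitespace, and the names `k_ins`, `k_state`, and `k_compile` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy opcodes; NOT Berry's real instruction set. */
enum { K_LOADK, K_ADD, K_MUL };

typedef struct { int op, a, b, c; } k_ins;

typedef struct {
    const char *src;    /* remaining input */
    k_ins *code;        /* output buffer */
    size_t ncode, cap;  /* instructions emitted so far, buffer capacity */
    int nextreg;        /* next free register */
} k_state;

static void k_emit(k_state *s, int op, int a, int b, int c)
{
    k_ins ins = { op, a, b, c };
    assert(s->ncode < s->cap);
    s->code[s->ncode++] = ins;
}

/* primary := integer. Emits LOADK into a fresh register and returns it. */
static int k_primary(k_state *s)
{
    int v = 0, r;
    assert(isdigit((unsigned char)*s->src));
    while (isdigit((unsigned char)*s->src))
        v = v * 10 + (*s->src++ - '0');
    r = s->nextreg++;
    k_emit(s, K_LOADK, r, v, 0);
    return r;
}

/* term := primary ('*' primary)* */
static int k_term(k_state *s)
{
    int r = k_primary(s);
    while (*s->src == '*') {
        int rhs;
        s->src++;
        rhs = k_primary(s);
        k_emit(s, K_MUL, r, r, rhs);
        s->nextreg = r + 1;   /* the right-hand register is free again */
    }
    return r;
}

/* expr := term ('+' term)* */
static int k_expr(k_state *s)
{
    int r = k_term(s);
    while (*s->src == '+') {
        int rhs;
        s->src++;
        rhs = k_term(s);
        k_emit(s, K_ADD, r, r, rhs);
        s->nextreg = r + 1;
    }
    return r;
}

/* Compiles src into out and returns the instruction count.
 * The result of the expression ends up in register 0. */
size_t k_compile(const char *src, k_ins *out, size_t cap)
{
    k_state s = { src, out, 0, cap, 0 };
    int r = k_expr(&s);
    assert(r == 0 && *s.src == '\0');
    return s.ncode;
}
```

For the input `1+2*3` this emits five instructions: three loads, then the multiply, then the add. Operator precedence falls out of the grammar structure rather than a later tree walk.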
### 3. **Memory Management** (`be_mem.c`, `be_gc.c`)
- Custom memory allocator
- Mark-sweep garbage collector
- Low memory footprint optimization
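
As a rough illustration of mark-sweep collection in general (not of the code in `be_gc.c`), the sketch below keeps every object on one intrusive list, marks what is reachable from a set of roots, and then frees the rest. The types `gc_obj` and `gc_heap`, the fixed two-reference layout, and the recursive mark are simplifying assumptions; a production collector would typically avoid unbounded recursion.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct gc_obj {
    struct gc_obj *next;    /* intrusive list of all allocated objects */
    struct gc_obj *ref[2];  /* outgoing references (may be NULL) */
    int marked;
} gc_obj;

typedef struct {
    gc_obj *all;   /* head of the all-objects list */
    size_t count;  /* number of objects currently allocated */
} gc_heap;

/* Allocate a zeroed object and link it into the heap's list. */
gc_obj *gc_new(gc_heap *h)
{
    gc_obj *o = calloc(1, sizeof *o);
    assert(o != NULL);
    o->next = h->all;
    h->all = o;
    h->count++;
    return o;
}

/* Mark phase: flag everything reachable from o. The marked check
 * also stops the recursion on reference cycles. */
static void gc_mark(gc_obj *o)
{
    if (o == NULL || o->marked)
        return;
    o->marked = 1;
    gc_mark(o->ref[0]);
    gc_mark(o->ref[1]);
}

/* Sweep phase: free unmarked objects and clear the mark on survivors.
 * Returns the number of objects freed. */
static size_t gc_sweep(gc_heap *h)
{
    size_t freed = 0;
    gc_obj **link = &h->all;
    while (*link != NULL) {
        gc_obj *o = *link;
        if (o->marked) {
            o->marked = 0;
            link = &o->next;
        } else {
            *link = o->next;   /* unlink before freeing */
            free(o);
            h->count--;
            freed++;
        }
    }
    return freed;
}

/* One full collection cycle from the given roots. */
size_t gc_collect(gc_heap *h, gc_obj **roots, size_t nroots)
{
    size_t i;
    for (i = 0; i < nroots; i++)
        gc_mark(roots[i]);
    return gc_sweep(h);
}
```

Because reachability is decided by tracing rather than by reference counts, a cycle of objects that nothing else points to is reclaimed on the next collection.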
### 4. **Type System**
- **Value Types**: int, real, boolean, string (not class objects)
- **Object Types**: list, map, range, class instances
- Value types are deliberately not class objects, trading pure object orientation for performance
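
One common way to get value types that are not class objects is a tagged union, where small values live inline and only object types carry a pointer. The sketch below shows that general technique; it is not Berry's actual value layout. The names `tv_value`, `tv_int`, `tv_real`, and `tv_add` are hypothetical, strings are left out for brevity, and the int-to-real promotion rule is an assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { T_NIL, T_INT, T_REAL, T_BOOL, T_OBJ } tv_type;

/* A value is a type tag plus a payload. Ints, reals and booleans are
 * stored inline; only T_OBJ points at a heap-allocated object. */
typedef struct {
    tv_type type;
    union {
        long i;
        double r;
        bool b;
        void *obj;
    } v;
} tv_value;

tv_value tv_int(long i)
{
    tv_value x;
    x.type = T_INT;
    x.v.i = i;
    return x;
}

tv_value tv_real(double r)
{
    tv_value x;
    x.type = T_REAL;
    x.v.r = r;
    return x;
}

static bool tv_isnumber(tv_value x)
{
    return x.type == T_INT || x.type == T_REAL;
}

static double tv_toreal(tv_value x)
{
    return x.type == T_INT ? (double)x.v.i : x.v.r;
}

/* Add two numbers with no method lookup and no allocation:
 * int + int stays int, any other numeric mix is promoted to real
 * (an assumption of this sketch). */
tv_value tv_add(tv_value a, tv_value b)
{
    assert(tv_isnumber(a) && tv_isnumber(b));
    if (a.type == T_INT && b.type == T_INT)
        return tv_int(a.v.i + b.v.i);
    return tv_real(tv_toreal(a) + tv_toreal(b));
}
```

Arithmetic on such values is a tag check and a machine operation, which is the kind of saving the trade-off against pure OOP is meant to buy.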
### 5. **Security Features** (Recently Added)
- **JSON Security**: Comprehensive buffer overflow protection
- Unicode handling with proper size calculation
- Input validation and sanitization
## Recent Security Work
### JSON Parser Security (`be_jsonlib.c`)
- **Fixed**: Critical buffer overflow in Unicode handling
- **Added**: Comprehensive security tests (10 test functions)
- **Implemented**: Safe string length calculation
- **Protected**: Against memory exhaustion attacks
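
The kind of check that "safe string length calculation" refers to can be sketched as follows. This is an illustrative function, not the code in `be_jsonlib.c`: the name `json_decoded_len` and its contract are hypothetical. It scans the body of a JSON string (the text between the quotes), rejects truncated or unknown escapes, and returns an upper bound on the UTF-8 bytes needed to hold the decoded result, so that a buffer can be sized before any bytes are written.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of a hex digit; the caller has already checked isxdigit(). */
static unsigned hexval(char c)
{
    if (c >= '0' && c <= '9')
        return (unsigned)(c - '0');
    return (unsigned)(tolower((unsigned char)c) - 'a') + 10u;
}

/* Returns an upper bound on the decoded length in bytes of the JSON
 * string body s[0..n), or -1 if an escape sequence is malformed.
 * Each \uXXXX is counted as a standalone code point (1 to 3 bytes);
 * a surrogate pair is therefore counted as 6 bytes although it encodes
 * to 4, which keeps the result a safe over-estimate. */
long json_decoded_len(const char *s, size_t n)
{
    long len = 0;
    size_t i = 0;
    while (i < n) {
        if (s[i] != '\\') {            /* ordinary byte, copied as-is */
            len += 1;
            i += 1;
            continue;
        }
        if (i + 1 >= n)
            return -1;                 /* dangling backslash */
        if (s[i + 1] == 'u') {
            unsigned cp = 0;
            size_t k;
            if (n - i < 6)
                return -1;             /* truncated \uXXXX */
            for (k = 2; k < 6; k++) {
                if (!isxdigit((unsigned char)s[i + k]))
                    return -1;
                cp = cp * 16u + hexval(s[i + k]);
            }
            len += cp < 0x80u ? 1 : cp < 0x800u ? 2 : 3;
            i += 6;
        } else {
            switch (s[i + 1]) {        /* single-character escapes */
            case '"': case '\\': case '/': case 'b':
            case 'f': case 'n': case 'r': case 't':
                len += 1;
                i += 2;
                break;
            default:
                return -1;             /* unknown escape */
            }
        }
    }
    return len;
}
```

A decoder built on this would allocate `json_decoded_len(...)` bytes plus a terminator and treat a return of -1 as a parse error, rather than assuming any fixed number of output bytes per escape.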
## Build System
- **Makefile** - Primary build system
- **CMakeLists.txt** - CMake support
- **library.json** - PlatformIO library definition
## Testing Infrastructure
- **51 unit tests** covering all major features
- **Automated test runner** via `make test`
- **Security regression tests** for vulnerability prevention
- **Cross-platform compatibility tests**
## File Statistics
- **Total Source Files**: ~200 files
- **Core C Code**: ~24,000 lines
- **Test Code**: ~15,000 lines
- **Documentation**: Comprehensive README and examples
- **Binary Size**: <40KiB interpreter core
- **Memory Usage**: <4KiB heap minimum

This repository contains a complete, production-ready embedded scripting language implementation, along with its test suite, security hardening, and development tools.