Berry Repository Structure Map

Overview

Berry is an ultra-lightweight, dynamically typed embedded scripting language designed for lower-performance embedded devices. The interpreter core is smaller than 40KiB and can run with less than 4KiB of heap.

Directory Structure

/src/ - Core Source Code (152 files)

Main Components:

  • Virtual Machine: be_vm.c (1419 lines) - Register-based VM execution
  • Parser: be_parser.c (1841 lines) - One-pass compiler and syntax analysis
  • Lexer: be_lexer.c (914 lines) - Tokenization and lexical analysis
  • API: be_api.c (1179 lines) - External C API interface
  • Code Generation: be_code.c (983 lines) - Bytecode generation
  • Garbage Collector: be_gc.c (613 lines) - Mark-sweep garbage collection

Data Types & Objects:

  • Strings: be_string.c (326 lines), be_strlib.c (1137 lines)
  • Lists: be_list.c (207 lines), be_listlib.c (556 lines)
  • Maps: be_map.c (354 lines), be_maplib.c (265 lines)
  • Classes: be_class.c (374 lines)
  • Functions: be_func.c (183 lines)
  • Bytes: be_byteslib.c (1992 lines) - Binary data handling

Built-in Libraries:

  • JSON: be_jsonlib.c (645 lines) - JSON parsing/generation
  • Math: be_mathlib.c (438 lines) - Mathematical functions
  • OS: be_oslib.c (271 lines) - Operating system interface
  • File: be_filelib.c (265 lines) - File I/O operations
  • Debug: be_debug.c (418 lines), be_debuglib.c (289 lines)
  • Introspection: be_introspectlib.c (298 lines)
  • Time: be_timelib.c (72 lines)

Memory & Execution:

  • Memory Management: be_mem.c (377 lines)
  • Execution: be_exec.c (531 lines)
  • Bytecode: be_bytecode.c (634 lines)
  • Variables: be_var.c (201 lines)
  • Modules: be_module.c (509 lines)

Headers:

  • Main Header: berry.h (2395 lines) - Primary API definitions
  • Constants: be_constobj.h (505 lines) - Constant object definitions

/tests/ - Unit Tests (54 files)

Core Language Tests:

  • assignment.be, bool.be, class.be, closure.be, function.be
  • for.be, vararg.be, cond_expr.be, exceptions.be

Data Type Tests:

  • list.be, map.be, range.be, string.be, int.be, bytes.be

Library Tests:

  • json.be (9168 lines) - Comprehensive JSON security tests
  • math.be, os.be, debug.be, introspect.be

Parser & Compiler Tests:

  • parser.be, lexer.be, compiler.be, suffix.be

Advanced Feature Tests:

  • virtual_methods.be, super_auto.be, class_static.be
  • division_by_zero.be, reference.be, compound.be

/examples/ - Example Programs (16 files)

  • fib_rec.be - Fibonacci recursion
  • qsort.be - Quick sort implementation
  • bintree.be - Binary tree operations
  • json.be - JSON usage examples
  • repl.be - REPL implementation

/default/ - Default Configuration (17 files)

  • berry_conf.h - Configuration settings
  • be_modtab.c - Module table definitions
  • be_port.c - Platform-specific code
  • berry.c - Main executable entry point

/generate/ - Generated Files (31 files)

  • be_const_strtab.h - String table constants
  • be_fixed_*.h - Fixed/compiled module definitions
  • Auto-generated constant definitions

/tools/ - Development Tools

Code Generation:

  • /coc/ - Compile-on-command tools (13 files)
  • Python scripts for code generation and optimization

Editor Support:

  • /plugins/vscode/ - Visual Studio Code plugin
  • /plugins/Notepad++/ - Notepad++ syntax highlighting

Grammar:

  • berry.ebnf - EBNF grammar definition
  • berry.bytecode - Bytecode format specification

Key Architecture Components

1. Virtual Machine (be_vm.c)

  • Register-based VM (not stack-based)
  • Optimized for low memory usage
  • Handles instruction execution and control flow
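
To make the register-based versus stack-based distinction concrete, here is a minimal sketch of a register machine in C. The opcodes, the `Instr` layout, and the names `vm_run` and `vm_demo` are hypothetical and chosen for illustration; they are not Berry's actual instruction encoding or the code in be_vm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-operand instruction set (illustration only). */
enum { OP_LOADK, OP_ADD, OP_MUL, OP_RET };

typedef struct {
    uint8_t op; /* opcode */
    uint8_t a;  /* destination register (or result register for OP_RET) */
    uint8_t b;  /* first source register, or constant index for OP_LOADK */
    uint8_t c;  /* second source register */
} Instr;

#define NREGS 8

/* Execute `code` until OP_RET and return the value of its `a` register. */
int vm_run(const Instr *code, const int *consts, size_t nconsts)
{
    int reg[NREGS] = {0};
    for (const Instr *pc = code; ; ++pc) {
        assert(pc->a < NREGS && pc->c < NREGS);
        switch (pc->op) {
        case OP_LOADK:
            assert(pc->b < nconsts);
            reg[pc->a] = consts[pc->b];
            break;
        case OP_ADD:
            assert(pc->b < NREGS);
            reg[pc->a] = reg[pc->b] + reg[pc->c];
            break;
        case OP_MUL:
            assert(pc->b < NREGS);
            reg[pc->a] = reg[pc->b] * reg[pc->c];
            break;
        case OP_RET:
            return reg[pc->a];
        default:
            assert(!"unknown opcode");
            return 0;
        }
    }
}

/* (2 + 3) * 4 in six instructions; operands are named directly,
   so there is no push/pop traffic as in a stack machine. */
int vm_demo(void)
{
    static const int consts[] = {2, 3, 4};
    static const Instr code[] = {
        {OP_LOADK, 0, 0, 0}, /* r0 = 2       */
        {OP_LOADK, 1, 1, 0}, /* r1 = 3       */
        {OP_ADD,   0, 0, 1}, /* r0 = r0 + r1 */
        {OP_LOADK, 1, 2, 0}, /* r1 = 4       */
        {OP_MUL,   0, 0, 1}, /* r0 = r0 * r1 */
        {OP_RET,   0, 0, 0}, /* return r0    */
    };
    return vm_run(code, consts, 3);
}
```

Each instruction names its source and destination registers, which is the property that distinguishes this design from a stack machine, where the same expression would need separate push and pop steps around every operator.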

2. Parser & Lexer (be_parser.c, be_lexer.c)

  • One-pass compilation
  • Generates bytecode directly
  • Error handling and recovery
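
One-pass compilation means the parser emits code while it recognizes the grammar, without first building a syntax tree. The sketch below shows the idea for single-digit `+` and `*` expressions. The `Compiler` struct, the `oc_*` functions, and the textual pseudo-instructions are assumptions made for this example, not the contents of be_parser.c or be_code.c, and the sketch uses `assert` where a real parser would report and recover from errors.

```c
#include <assert.h>
#include <ctype.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *src; /* next unread character */
    char out[256];   /* emitted pseudo-assembly */
    size_t len;      /* bytes used in out */
    int next_reg;    /* next free register */
} Compiler;

/* Append one formatted instruction to the output buffer. */
void oc_emit(Compiler *c, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(c->out + c->len, sizeof c->out - c->len, fmt, ap);
    va_end(ap);
    assert(n >= 0 && (size_t)n < sizeof c->out - c->len);
    c->len += (size_t)n;
}

/* primary := digit ; loads the digit into a fresh register */
int oc_primary(Compiler *c)
{
    assert(isdigit((unsigned char)*c->src));
    int r = c->next_reg++;
    int value = *c->src - '0';
    c->src++;
    oc_emit(c, "LOADI r%d %d\n", r, value);
    return r;
}

/* term := primary { '*' primary } */
int oc_term(Compiler *c)
{
    int left = oc_primary(c);
    while (*c->src == '*') {
        c->src++;
        int right = oc_primary(c);
        oc_emit(c, "MUL r%d r%d r%d\n", left, left, right);
        c->next_reg = left + 1; /* release the temporary */
    }
    return left;
}

/* expr := term { '+' term } */
int oc_expr(Compiler *c)
{
    int left = oc_term(c);
    while (*c->src == '+') {
        c->src++;
        int right = oc_term(c);
        oc_emit(c, "ADD r%d r%d r%d\n", left, left, right);
        c->next_reg = left + 1; /* release the temporary */
    }
    return left;
}

/* Compile `src` in a single left-to-right pass; returns the emitted text. */
const char *oc_compile(Compiler *c, const char *src)
{
    memset(c, 0, sizeof *c);
    c->src = src;
    oc_expr(c);
    assert(*c->src == '\0'); /* whole input must be consumed */
    return c->out;
}
```

For the input `1+2*3`, the instructions are emitted in the order the operands are parsed: three loads, then `MUL`, then `ADD`. No intermediate tree is kept, which is what keeps the memory cost of compilation low.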

3. Memory Management (be_mem.c, be_gc.c)

  • Custom memory allocator
  • Mark-sweep garbage collector
  • Low memory footprint optimization
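
The mark-sweep approach can be sketched in a few lines of C. This is a simplified illustration under stated assumptions: every object holds at most one outgoing reference, and the `Obj`, `Heap`, and `gc_*` names are hypothetical. It is not the implementation in be_gc.c.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Obj {
    struct Obj *next; /* intrusive list linking every allocated object */
    struct Obj *ref;  /* single outgoing reference (simplification) */
    int marked;       /* set during the mark phase */
} Obj;

typedef struct {
    Obj *all;     /* head of the all-objects list */
    size_t count; /* number of objects currently allocated */
} Heap;

/* Allocate an object that references `ref` (may be NULL). */
Obj *gc_new(Heap *h, Obj *ref)
{
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->ref = ref;
    o->marked = 0;
    o->next = h->all;
    h->all = o;
    h->count++;
    return o;
}

/* Mark phase: follow references from one root. Stopping at an
   already-marked object makes reference cycles terminate. */
void gc_mark(Obj *o)
{
    while (o != NULL && !o->marked) {
        o->marked = 1;
        o = o->ref;
    }
}

/* Sweep phase: free unmarked objects, clear marks on survivors.
   Returns the number of objects freed. */
size_t gc_sweep(Heap *h)
{
    size_t freed = 0;
    Obj **link = &h->all;
    while (*link != NULL) {
        Obj *o = *link;
        if (o->marked) {
            o->marked = 0;
            link = &o->next;
        } else {
            *link = o->next; /* unlink before freeing */
            free(o);
            h->count--;
            freed++;
        }
    }
    return freed;
}

/* One full collection cycle over the given root set. */
size_t gc_collect(Heap *h, Obj **roots, size_t nroots)
{
    for (size_t i = 0; i < nroots; i++)
        gc_mark(roots[i]);
    return gc_sweep(h);
}
```

Anything not reachable from a root is reclaimed in the sweep, including cyclic garbage that reference counting alone would leak.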

4. Type System

  • Value Types: int, real, boolean, string (not class objects)
  • Object Types: list, map, range, class instances
  • Basic value types are not class objects, which trades pure object orientation for performance
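
The value/object split is commonly implemented as a tagged union, sketched below. The `Value` layout, the `Tag` names, and the `val_*` helpers are assumptions for illustration rather than Berry's actual `bvalue` definition, and strings are left out to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { T_NIL, T_INT, T_REAL, T_BOOL, T_OBJECT } Tag;

typedef struct {
    Tag tag;
    union {
        long i;    /* int: stored inline */
        double r;  /* real: stored inline */
        bool b;    /* boolean: stored inline */
        void *obj; /* list, map, range, instance: heap object */
    } v;
} Value;

Value val_int(long i)    { Value x; x.tag = T_INT;  x.v.i = i; return x; }
Value val_real(double r) { Value x; x.tag = T_REAL; x.v.r = r; return x; }

/* Read a numeric value as a double; caller checks the tag first. */
static double as_real(Value x)
{
    return x.tag == T_INT ? (double)x.v.i : x.v.r;
}

/* Add two numeric values. Only tag checks and a machine add are
   needed: no method lookup, no allocation. Returns false for
   non-numeric operands. */
bool val_add(Value a, Value b, Value *out)
{
    assert(out != NULL);
    bool a_num = a.tag == T_INT || a.tag == T_REAL;
    bool b_num = b.tag == T_INT || b.tag == T_REAL;
    if (!a_num || !b_num)
        return false;
    if (a.tag == T_INT && b.tag == T_INT)
        *out = val_int(a.v.i + b.v.i);
    else
        *out = val_real(as_real(a) + as_real(b));
    return true;
}
```

Because an int or real lives directly inside the `Value`, arithmetic never touches the heap; only the object types carry a pointer that the garbage collector has to trace.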

5. Security Features (Recently Added)

  • JSON Security: Comprehensive buffer overflow protection
  • Unicode handling with proper size calculation
  • Input validation and sanitization

Recent Security Work

JSON Parser Security (be_jsonlib.c)

  • Fixed: Critical buffer overflow in Unicode handling
  • Added: Comprehensive security tests (10 test functions)
  • Implemented: Safe string length calculation
  • Protected: Against memory exhaustion attacks
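
To illustrate what a safe string length calculation involves, the sketch below computes how many bytes a JSON string needs once its escapes are decoded, sizing each `\uXXXX` escape by the UTF-8 length of its code point. The function names are hypothetical and this is not the code in be_jsonlib.c; it only shows the general technique of sizing the buffer before decoding into it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if `c` is not a hex digit. */
int json_hexval(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* `s` points just past the opening quote of a JSON string.
   Returns the number of bytes the decoded string occupies, or -1 on
   malformed input (unterminated string, unknown escape, short \u).
   Surrogate halves are counted as 3 bytes each, so a surrogate pair
   is over-estimated (6 instead of 4), which is safe for sizing. */
long json_decoded_len(const char *s)
{
    size_t n = 0;
    while (*s != '"') {
        if (*s == '\0')
            return -1;        /* unterminated string */
        if (*s != '\\') {     /* ordinary byte, copied as-is */
            n++;
            s++;
            continue;
        }
        s++;                  /* skip the backslash */
        if (*s == 'u') {
            unsigned cp = 0;
            for (int i = 1; i <= 4; i++) {
                /* a NUL yields -1 here, so we never read past the end */
                int h = json_hexval(s[i]);
                if (h < 0)
                    return -1;
                cp = cp * 16 + (unsigned)h;
            }
            n += cp < 0x80 ? 1 : cp < 0x800 ? 2 : 3;
            s += 5;           /* 'u' plus four hex digits */
        } else if (*s != '\0' && strchr("\"\\/bfnrt", *s) != NULL) {
            n++;              /* single-character escape */
            s++;
        } else {
            return -1;        /* unknown or truncated escape */
        }
    }
    return (long)n;
}
```

A decoder that allocates `json_decoded_len(s) + 1` bytes before writing cannot overrun its buffer on `\u` escapes, because the worst-case expansion of each escape has already been accounted for.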

Build System

  • Makefile - Primary build system
  • CMakeLists.txt - CMake support
  • library.json - PlatformIO library definition

Testing Infrastructure

  • 51 unit tests covering all major features
  • Automated test runner via make test
  • Security regression tests for vulnerability prevention
  • Cross-platform compatibility tests

File Statistics

  • Total Source Files: ~200
  • Core C Code: ~24,000 lines
  • Test Code: ~15,000 lines
  • Documentation: Comprehensive README and examples
  • Binary Size: <40KiB interpreter core
  • Memory Usage: <4KiB heap minimum

The repository contains a complete, production-ready embedded scripting language, together with its test suite, security regression tests, and development tools.