📋 SAMPLE REPORT — Real analysis of the "codedictate" project (4,260 LOC, 47 Findings)
VibeCodeDoktor

Your Personal Code Guide for codedictate

Report ID: VCD-SAMPLE-codedictate
Date: February 28, 2026
Lines of Code: 4,260
Language: Python
Tech Stack: Flask, Whisper, SQLAlchemy

codedictate is a Flask-based dictation solution using OpenAI Whisper for speech recognition. The project shows solid foundational architecture but has significant security gaps (hardcoded secrets, missing authentication), insufficient test coverage, and several dead code paths.

Some dependencies are outdated with known vulnerabilities.

Think of this report as a guide, not a grade. Every hint includes a concrete next step — often a single AI prompt is enough.

Your Roadmap

The three most important next steps

1. Security: Unrestricted File Upload Without Validation
   Attackers can upload executable code or overwhelm the server with oversized files.
2. Complexity: Deeply Nested Error Handling (5 Levels)
   Errors get caught at the wrong level, leading to silent failures and hard-to-find bugs.
3. Tests: No Integration Tests for API Endpoints
   Routing errors, wrong HTTP status codes, and serialization issues are only discovered in production.

You're on the right track. Every fix makes your code better for you and for AI. Keep going!

Like what you see?

Your code probably has similar issues. 16 AI agents find them for you — in 15-45 minutes.

Get a report for your project — EUR 19

I analyzed these areas:

Overview

21 Quick Fixes: small changes, often a single AI prompt
24 Important Improvements: need a bit more attention, but worth it
2 Strategic Upgrades: larger refactors for long-term stability

Complexity

Why This Matters

AI assistants pack everything into one big function. It works until you change something; then everything breaks. If a function has more than 50 lines or more than 3 nesting levels, ask the AI to split it.

MAJOR Medium

Deeply Nested Error Handling (5 Levels)

server/transcribe.py:128
What's Happening

Five nested try/except blocks make the control flow nearly impossible to follow.

What Happens if You Don't Fix This

Errors get caught at the wrong level, leading to silent failures and hard-to-find bugs.

How to Fix
  1. Extract each try/except block into its own function
  2. Use specific exceptions instead of generic ones
  3. Use early-return pattern for error paths
AI Fix Prompt

"In server/transcribe.py starting at line 128, flatten the 5 nested try/except blocks by extracting each into a separate function that raises specific exceptions."

Code
try:
    try:
        try:
            result = whisper.transcribe(chunk)
        except APIError:
            try:
                result = whisper.transcribe(chunk, model="base")
            except:  # bare except swallows everything
                ...
# ...except clauses for the two outer try blocks omitted from this excerpt
What to Remember

Flatten nested error handling by extracting functions. Each function handles one concern and its specific errors.
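The extraction described above can be sketched roughly like this (function and exception names are illustrative, and a stand-in replaces the real Whisper call so the sketch is self-contained):

```python
class TranscriptionError(Exception):
    """Raised when all transcription attempts fail."""

def transcribe_with_fallback(chunk, primary, fallback="base"):
    """One concern: try the primary model, then fall back to a smaller one."""
    try:
        return fake_whisper_transcribe(chunk, model=primary)
    except RuntimeError:  # stand-in for the real APIError
        return fake_whisper_transcribe(chunk, model=fallback)

def transcribe_chunk(chunk):
    """Single flat entry point; each helper handles its own errors."""
    try:
        return transcribe_with_fallback(chunk, primary="medium")
    except RuntimeError as exc:
        raise TranscriptionError(str(exc)) from exc

def fake_whisper_transcribe(chunk, model):
    """Stand-in for whisper.transcribe() so the sketch runs standalone."""
    if model == "medium":
        raise RuntimeError("primary model unavailable")
    return f"[{model}] transcribed {len(chunk)} bytes"
```

Each helper can now be unit-tested in isolation, and the specific exceptions make the failure paths explicit.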

MAJOR Significant

Circular Import Between app.py and models.py

server/app.py:8
What's Happening

app.py imports models.py, and models.py imports app.py for the db instance. This is worked around with delayed imports, making the code fragile.

What Happens if You Don't Fix This

Any restructuring can lead to ImportError. The code is hard to test because import order is critical.

How to Fix
  1. Move database instance to its own module (db.py)
  2. Both modules import from db.py instead of each other
  3. Introduce Flask Application Factory pattern
AI Fix Prompt

"Create server/db.py exporting the SQLAlchemy db instance. Update server/app.py and server/models.py to import from server/db.py instead of each other."

Code
# server/app.py
from server.models import User, Transcription

# server/models.py
from server.app import db  # circular!
What to Remember

Circular imports indicate poor module boundaries. Extract shared dependencies into a separate module.

MAJOR Medium

Configuration Scattered Across 6 Files

server/config.py:1
What's Happening

Configuration values are spread across config.py, app.py, models.py, transcribe.py, whisper_api.py, and setup.py with sometimes conflicting defaults.

What Happens if You Don't Fix This

Inconsistent configuration leads to hard-to-reproduce bugs, especially between development and production.

How to Fix
  1. Centralize all configuration in config.py
  2. Use environment-dependent config classes (Development, Production, Testing)
  3. Other modules import from config.py
AI Fix Prompt

"Consolidate all configuration into server/config.py with Development/Production/Testing classes. Update all other files to import from config."

Code
# config.py: WHISPER_MODEL = "medium"
# transcribe.py: MODEL = os.getenv("MODEL", "small")  # conflicts!
# whisper_api.py: DEFAULT_MODEL = "base"  # another conflict!
What to Remember

Configuration should live in one place. A single source of truth eliminates configuration drift.
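A minimal sketch of the environment-dependent config classes (field names mirror the report's examples; defaults are illustrative):

```python
import os

class BaseConfig:
    WHISPER_MODEL = os.getenv("WHISPER_MODEL", "medium")
    SQLALCHEMY_DATABASE_URI = os.getenv("DATABASE_URL", "sqlite:///local.db")

class DevelopmentConfig(BaseConfig):
    DEBUG = True

class ProductionConfig(BaseConfig):
    DEBUG = False

class TestingConfig(BaseConfig):
    TESTING = True
    SQLALCHEMY_DATABASE_URI = "sqlite:///:memory:"

CONFIGS = {
    "development": DevelopmentConfig,
    "production": ProductionConfig,
    "testing": TestingConfig,
}

def get_config(env=None):
    """Resolve the config class from FLASK_ENV (defaults to development)."""
    return CONFIGS[env or os.getenv("FLASK_ENV", "development")]
```

Every other module then imports from config.py, so there is exactly one place where defaults can conflict.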

MINOR Significant

Overly Complex Audio Chunking Algorithm

server/audio.py:89
What's Happening

The chunking algorithm implements silence detection, overlap handling, and dynamic chunk sizing in a 120-line function with 8 local variables.

What Happens if You Don't Fix This

Hard to debug when audio quality varies. Tiny changes can dramatically worsen transcription results.

How to Fix
  1. Use pydub or librosa for silence detection
  2. Break algorithm into clearly named sub-steps
  3. Use configurable thresholds instead of magic numbers
AI Fix Prompt

"Refactor chunk_audio() in server/audio.py to use pydub.silence.split_on_silence() instead of the custom implementation. Extract constants to config."

Code
def chunk_audio(audio_data, sr=16000):
    threshold = 0.015  # magic number
    min_silence = int(sr * 0.3)  # magic number
    # ... 120 lines of sliding window logic
What to Remember

Prefer well-tested libraries (pydub, librosa) over custom implementations for common audio processing tasks.

MINOR Medium

UI Module Has Mixed Responsibilities

client/ui.py:1
What's Happening

client/ui.py contains both GUI rendering and business logic (file selection, API calls, error handling) in a 450-line file.

What Happens if You Don't Fix This

Changes to layout can break business logic and vice versa. Unit testing is practically impossible.

How to Fix
  1. Extract API calls to separate client/api.py module
  2. Move business logic to client/controller.py
  3. Use UI module only for rendering and event binding
AI Fix Prompt

"Split client/ui.py into three modules: client/ui.py (rendering only), client/api.py (API calls), client/controller.py (business logic coordination)."

Code
class MainWindow:
    def on_record_click(self):
        # 80 lines mixing UI updates, API calls, and error handling
        self.status_label.config(text="Recording...")
        audio = self.recorder.record()
        response = requests.post(API_URL + "/upload", ...)
What to Remember

Separate presentation from logic. MVC or similar patterns make code testable and maintainable.

CRITICAL Major Effort

God Function: transcribe_and_process (380 lines)

server/transcribe.py:45
What's Happening

A single function handles audio decoding, chunking, Whisper API calls, post-processing, punctuation, and database writes — all in 380 lines.

What Happens if You Don't Fix This

Virtually untestable, extremely error-prone to modify. Any bug fix can have unintended side effects.

How to Fix
  1. Split into dedicated steps: decode_audio, chunk_audio, call_whisper, postprocess, save_result
  2. Make each step independently testable
  3. Implement a pipeline pattern (step by step)
AI Fix Prompt

"Refactor transcribe_and_process() in server/transcribe.py into 5 smaller functions: decode_audio(), chunk_audio(), call_whisper(), postprocess_text(), save_transcription(). Wire them together in a pipeline function."

Code
def transcribe_and_process(audio_file, user_id, language="de"):
    # ... 380 lines of nested logic
    # audio decoding, chunking, API calls, text cleanup, DB writes
What to Remember

Functions over 50 lines are a smell. Over 100 is dangerous. Over 300 is a maintenance nightmare. Split along responsibility boundaries.
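The pipeline split can be sketched as below; the stage bodies are trivial stand-ins, only the wiring pattern is the point:

```python
def decode_audio(audio_file):
    # stand-in for real decoding (e.g. ffmpeg/pydub)
    return b"pcm:" + audio_file.encode()

def chunk_audio(pcm):
    # stand-in for silence-aware chunking
    return [pcm[i:i + 8] for i in range(0, len(pcm), 8)]

def call_whisper(chunks):
    # stand-in for the Whisper API call
    return " ".join(f"<{len(c)}b>" for c in chunks)

def postprocess_text(text):
    return text.strip()

def save_transcription(text):
    # stand-in for the database write
    return {"saved": True, "text": text}

def transcribe_and_process(audio_file):
    """Thin coordinator: no logic of its own, just the pipeline order."""
    pcm = decode_audio(audio_file)
    chunks = chunk_audio(pcm)
    raw = call_whisper(chunks)
    clean = postprocess_text(raw)
    return save_transcription(clean)
```

Each stage is now small enough to unit-test on its own, and the coordinator stays a few lines long no matter how complex the stages get.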

INFO Significant

Cyclomatic Complexity of 28 in Request Handler

server/app.py:45
What's Happening

The /transcribe endpoint has 28 different decision paths through nested conditions and error handling.

What Happens if You Don't Fix This

Each change theoretically requires testing 28 paths. The increased regression risk slows down development.

How to Fix
  1. Split handler into middleware chain
  2. Extract validation, processing, and response into separate functions
  3. Use guard clauses instead of deep nesting
AI Fix Prompt

"Refactor the /transcribe endpoint in server/app.py to use guard clauses for early returns and extract validation, processing, and response formatting into separate functions."

Code
@app.route("/transcribe", methods=["POST"])
def transcribe():
    if request.content_type:
        if "multipart" in request.content_type:
            if "audio" in request.files:
                # ... 15 more levels of nesting
What to Remember

Aim for cyclomatic complexity under 10. Use guard clauses (early returns) to flatten nested conditions.
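The guard-clause style can be sketched with the validation half of the handler; the request is modeled as a plain dict so the pattern runs standalone:

```python
def validate_upload(request):
    """Flat validation with early returns instead of nested ifs.

    'request' is a dict stand-in for flask.request in this sketch.
    """
    if not request.get("content_type"):
        return (400, "missing content type")
    if "multipart" not in request["content_type"]:
        return (415, "expected multipart/form-data")
    if "audio" not in request.get("files", {}):
        return (400, "missing audio file")
    return (200, "ok")
```

Each condition adds one decision path instead of one nesting level, which keeps cyclomatic complexity linear and readable.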

Tests

Why This Matters

In vibe coding, testing is the only guarantee. Every new prompt can break old code. Golden rule: write a test proving current state works BEFORE asking AI to change anything.

MAJOR Significant

No Integration Tests for API Endpoints

server/app.py:1
What's Happening

Not a single API endpoint is tested with HTTP requests. Neither upload, transcription, nor admin routes have integration tests.

What Happens if You Don't Fix This

Routing errors, wrong HTTP status codes, and serialization issues are only discovered in production.

How to Fix
  1. Use Flask test client (app.test_client())
  2. Test happy path + error cases for each endpoint
  3. Validate request/response format and status codes
AI Fix Prompt

"Create tests/test_api.py using Flask test client. Test all routes: GET /health, POST /upload, POST /transcribe, GET /transcriptions, /admin/* with both valid and invalid inputs."

Code
# No integration tests exist. Example of what should be:
# def test_upload_invalid_file(client):
#     response = client.post("/upload", data={"audio": (BytesIO(b"not audio"), "test.exe")})
#     assert response.status_code == 400
What to Remember

Every API endpoint needs at least a happy-path and an error-case integration test using the framework test client.

MAJOR Medium

No Test Fixtures or Factories

tests/test_transcribe.py:1
What's Happening

Test data is created inline without reusable fixtures. Every new test must write its own setup code.

What Happens if You Don't Fix This

Duplicated setup code leads to inconsistent test data and makes tests hard to maintain.

How to Fix
  1. Create pytest fixtures in conftest.py
  2. Factory functions for commonly needed test objects
  3. Provide fixtures for app instance, DB session, test client
AI Fix Prompt

"Create tests/conftest.py with fixtures: app (Flask test app), client (test client), db_session (test database), sample_audio (test audio file). Create tests/factories.py for User and Transcription factories."

Code
# Current: no fixtures, each test duplicates setup
def test_something():
    app = create_app()  # duplicated
    db.create_all()  # duplicated
    user = User(email="test@test.com")  # duplicated
What to Remember

Good test infrastructure (fixtures, factories, helpers) pays for itself within weeks by making tests easy to write and maintain.
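The factory half of that infrastructure could start as plain functions like the sketch below (field names are illustrative, not taken from the project's models):

```python
import itertools

_seq = itertools.count(1)  # gives every factory call unique values

def make_user(**overrides):
    """Build a user dict with sensible defaults; overrides win."""
    n = next(_seq)
    user = {"email": f"user{n}@example.com", "name": f"User {n}"}
    user.update(overrides)
    return user

def make_transcription(user=None, **overrides):
    """Build a transcription dict, creating a user if none is given."""
    item = {"title": "Untitled", "text": "", "user": user or make_user()}
    item.update(overrides)
    return item
```

Tests then state only what they care about (`make_user(email="x@y.z")`) and inherit consistent defaults for everything else.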

MINOR Quick Fix

Mocked Tests Don't Match Real API

tests/test_transcribe.py:12
What's Happening

The only mock for the Whisper API returns a simplified object that doesn't match the actual API response format.

What Happens if You Don't Fix This

Tests pass even though the code would fail with the real API. False confidence.

How to Fix
  1. Align mock responses with real API responses
  2. Use fixture with recorded real API response
  3. Write contract test against API documentation
AI Fix Prompt

"In tests/test_transcribe.py, update the Whisper API mock to return the actual response format including segments, language, and duration fields."

Code
# Mock returns simplified format:
mock_whisper.return_value = {"text": "hello world"}
# Real API returns: {"text": "...", "segments": [...], "language": "de", "duration": 12.5}
What to Remember

Mocks must match the real interface. Record real responses and use them as fixtures to prevent drift.
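A sketch of a mock shaped like the response format the report describes (text, segments, language, duration); the concrete values are illustrative:

```python
from unittest.mock import MagicMock

# Shape mirrors the real Whisper response fields named in the finding.
REAL_SHAPE = {
    "text": "hello world",
    "segments": [{"start": 0.0, "end": 1.2, "text": "hello world"}],
    "language": "de",
    "duration": 12.5,
}

mock_whisper = MagicMock(return_value=REAL_SHAPE)
```

Recording one real API response into a fixture file and loading it here would prevent the mock from drifting again.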

CRITICAL Major Effort

Only 8% Test Coverage (2 Tests for 4260 LOC)

tests/test_transcribe.py:1
What's Happening

The entire project has only 2 tests in a single test file. Core functionality like upload, transcription, and authentication is untested.

What Happens if You Don't Fix This

Any change can silently break existing functionality. Refactoring becomes a gamble.

How to Fix
  1. Set up test framework (pytest, pytest-flask)
  2. Test at least every API endpoint
  3. Cover critical business logic with unit tests
  4. Set up CI pipeline with test execution
AI Fix Prompt

"Set up pytest with pytest-flask. Create test files: tests/test_api.py (endpoint tests), tests/test_transcribe.py (transcription logic), tests/test_models.py (database operations). Target 60% coverage minimum."

Code
# tests/test_transcribe.py — ENTIRE test suite:
def test_whisper_returns_text():
    assert transcribe("hello.wav") != ""

def test_empty_audio():
    assert transcribe("empty.wav") == ""
What to Remember

Test coverage below 40% means you are flying blind. Prioritize testing critical paths: auth, payment, data persistence.

CRITICAL Medium

Tests Use Production Database

tests/test_transcribe.py:5
What's Happening

Existing tests connect to the same database as production because no test configuration exists.

What Happens if You Don't Fix This

Tests can modify or delete production data. An accidental test run can destroy real user data.

How to Fix
  1. Configure separate test database connection in conftest.py
  2. Use SQLite in-memory for fast unit tests
  3. Create fixtures for test data setup and teardown
AI Fix Prompt

"Create tests/conftest.py with a test database fixture using SQLite in-memory. Update tests to use the fixture instead of importing from server.config directly."

Code
# tests/test_transcribe.py
from server.config import DATABASE_URL  # same as production!
from server.models import db

def test_save_transcription():
    db.session.add(...)  # writes to production DB!
What to Remember

Tests must never touch production databases. Use separate test databases, fixtures, and cleanup.
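The in-memory idea can be sketched with stdlib sqlite3; the real fixture would instead bind a SQLAlchemy session to "sqlite:///:memory:", but the isolation property is the same:

```python
import sqlite3

def make_test_db():
    """Fresh in-memory database per test; vanishes when closed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transcriptions (id INTEGER PRIMARY KEY, text TEXT)"
    )
    return conn

def test_save_transcription():
    db = make_test_db()
    db.execute("INSERT INTO transcriptions (text) VALUES (?)", ("hello",))
    rows = db.execute("SELECT text FROM transcriptions").fetchall()
    assert rows == [("hello",)]
    db.close()
```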

INFO Medium

No CI/CD Pipeline Configuration

setup.py:1
What's Happening

There is no GitHub Actions, GitLab CI, or any other CI/CD configuration. Tests must be run manually.

What Happens if You Don't Fix This

Without automated test execution, tests are forgotten and code quality erodes silently.

How to Fix
  1. Create GitHub Actions workflow for tests
  2. Run lint check (flake8/ruff) and tests on every push
  3. Enable branch protection rules
AI Fix Prompt

"Create .github/workflows/test.yml that runs pytest on every push and PR. Include ruff linting. Add a badge to README.md."

Code
# No CI/CD configuration files found:
# No .github/workflows/*.yml
# No .gitlab-ci.yml
# No Jenkinsfile
What to Remember

CI/CD is non-negotiable for any team project. Automate linting and tests from day one.

Dead Code

Why This Matters

Dead code accumulates from AI sessions — old approaches left behind. It confuses both you and future AI prompts. Keep code clean: what's not needed gets deleted.

MAJOR Quick Fix

Entire Module server/whisper_api.py is Unused

server/whisper_api.py:1
What's Happening

server/whisper_api.py is not imported by any other module. The Whisper integration is done directly in transcribe.py.

What Happens if You Don't Fix This

Dead module confuses new developers and gets accidentally modified during refactoring.

How to Fix
  1. Delete the module since it is unused
  2. If desired: refactor transcribe.py to actually use whisper_api.py
AI Fix Prompt

"Delete server/whisper_api.py — it is not imported anywhere. Run grep -r 'whisper_api' to confirm no references exist."

Code
# server/whisper_api.py — 180 lines, imported by nobody
class WhisperClient:
    def __init__(self, api_key, model="medium"):
        ...
    def transcribe(self, audio_path, language="de"):
        ...
What to Remember

Dead code is not free. It costs attention, creates confusion, and can introduce bugs when accidentally modified.

MAJOR Quick Fix

14 Unused Imports Across 5 Files

server/app.py:3
What's Happening

14 imported modules or functions are never used: json, sys, re in app.py, hashlib and hmac in auth.py, among others.

What Happens if You Don't Fix This

Unused imports obscure the module's real dependencies, add noise to every read of the file, and for heavy modules can slow startup.

How to Fix
  1. Use ruff or autoflake to automatically remove unused imports
  2. Set up isort for consistent import sorting
  3. Set up pre-commit hook for import checks
AI Fix Prompt

"Run ruff check --select F401 --fix server/ to auto-remove all unused imports. Then run ruff check --select I --fix server/ to sort remaining imports."

Code
import json  # unused
import sys  # unused
import re  # unused
from flask import Flask, request, jsonify, redirect  # redirect unused
What to Remember

Use an auto-formatter (ruff, autoflake) to catch unused imports. Configure as a pre-commit hook to prevent accumulation.

MINOR Quick Fix

Commented-Out Feature: WebSocket Streaming

server/app.py:245
What's Happening

65 lines of commented-out WebSocket code for real-time streaming block readability without benefit.

What Happens if You Don't Fix This

Commented-out code is rarely re-enabled, but it confuses developers and complicates code reviews.

How to Fix
  1. Delete the commented-out code
  2. If the feature is planned: document as issue/ticket
  3. Git history preserves the code anyway
AI Fix Prompt

"Delete the commented-out WebSocket streaming code at server/app.py lines 245-310. Create a GitHub issue 'Implement real-time WebSocket streaming' if the feature is still planned."

Code
# TODO: WebSocket streaming (v2)
# @socketio.on("audio_stream")
# def handle_stream(data):
#     chunk = data["chunk"]
#     # ... 60 more commented lines
What to Remember

Delete commented-out code. Git preserves history. Dead comments are noise, not documentation.

MINOR Quick Fix

Unreachable Code After Early Return

server/transcribe.py:290
What's Happening

After a return statement at line 288, there are 12 lines of code that can never execute.

What Happens if You Don't Fix This

Developers might assume this code executes and introduce bugs based on false assumptions.

How to Fix
  1. Delete the unreachable code
  2. Check whether the early return is correct or whether the code should be reachable
AI Fix Prompt

"Delete the unreachable code at server/transcribe.py lines 290-302 (after the return at line 288). Verify the return at 288 is intentional."

Code
    return result  # line 288

    # Dead code below — never executes:
    logger.info("Post-processing complete")
    stats.record_transcription(len(result))
    cache.set(cache_key, result)
What to Remember

Unreachable code after return statements is a common mistake. Use a linter rule to catch it automatically.

INFO Quick Fix

Legacy Database Migration Script Still Present

server/utils.py:78
What's Happening

A function migrate_v1_to_v2() in utils.py was written for a one-time data migration and is never called again.

What Happens if You Don't Fix This

Minimal risk, but increases cognitive load when reading utils.py.

How to Fix
  1. Delete the function or move to a separate migrations directory
AI Fix Prompt

"Delete the migrate_v1_to_v2() function at server/utils.py lines 78-120. It was a one-time migration and is no longer called."

Code
def migrate_v1_to_v2():
    """One-time migration from v1 schema to v2. Run once, then delete."""
    # ... 40 lines of migration logic
    # Last run: 2024-08-15
What to Remember

One-time scripts should live in a separate directory (e.g., migrations/) or be deleted after execution.

Code Quality

Why This Matters

AI repeats itself between prompts: inconsistent naming, duplicate functions. Each issue is small on its own, but together they make code unmaintainable. Check regularly.

MAJOR Quick Fix

Bare except Catches SystemExit and KeyboardInterrupt

server/transcribe.py:198
What's Happening

except: without a specific exception catches everything including SystemExit, KeyboardInterrupt, and MemoryError.

What Happens if You Don't Fix This

Process cannot be cleanly terminated. Severe system errors are swallowed and go unnoticed.

How to Fix
  1. Replace except: with except Exception:
  2. Even better: catch specific exceptions (requests.Timeout, openai.APIError)
  3. Add logging for caught exceptions
AI Fix Prompt

"In server/transcribe.py, replace all bare except: clauses with except Exception as e: and add logging.exception("...") calls."

Code
try:
    result = process_audio(chunk)
except:  # catches EVERYTHING
    result = ""  # silently returns empty string
What to Remember

Never use bare except:. Always catch specific exceptions or at minimum except Exception.
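The fix can be sketched as below; ValueError stands in for the project's real API-specific exceptions:

```python
import logging

logger = logging.getLogger(__name__)

def safe_process(chunk, process):
    """Catch only expected errors, log them, and never swallow
    SystemExit or KeyboardInterrupt."""
    try:
        return process(chunk)
    except ValueError:  # specific, not bare
        logger.exception("processing failed for chunk of %d bytes", len(chunk))
        return ""
```

With a bare `except:` the same code would also trap Ctrl-C and interpreter shutdown; catching a named exception leaves those untouched.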

MAJOR Significant

Global Mutable State in Module Scope

server/transcribe.py:12
What's Happening

Multiple mutable global variables (active_jobs, cache_dict, stats_counter) are shared between requests without synchronization.

What Happens if You Don't Fix This

Race conditions under load can lead to data loss, corrupt counters, or cache inconsistencies.

How to Fix
  1. Replace global variables with thread-safe alternatives (threading.Lock)
  2. Use Redis for shared state
  3. Use Flask context or dependency injection
AI Fix Prompt

"Replace global mutable state in server/transcribe.py with thread-safe alternatives: use threading.Lock for active_jobs, use Flask-Caching for cache_dict, use Redis or atomic operations for stats_counter."

Code
active_jobs = {}  # mutable global, shared across threads
cache_dict = {}  # mutable global, no TTL, no size limit
stats_counter = {"total": 0, "errors": 0}  # race condition!
What to Remember

Global mutable state is the enemy of concurrent code. Use thread-safe structures or external state stores.
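A lock-protected counter replacing the bare global dict might look like this sketch (the class name is illustrative):

```python
import threading

class StatsCounter:
    """Thread-safe replacement for the module-level stats_counter dict."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {"total": 0, "errors": 0}

    def increment(self, key):
        with self._lock:  # serializes read-modify-write
            self._counts[key] += 1

    def snapshot(self):
        with self._lock:
            return dict(self._counts)  # copy, so callers can't mutate state
```

For multi-process deployments (gunicorn workers), a lock is not enough and the state belongs in Redis or the database.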

MINOR Quick Fix

Magic Numbers in Audio Processing

server/audio.py:34
What's Happening

Numeric values like 16000, 0.015, 0.3, 512 are used directly in code without explanatory constants.

What Happens if You Don't Fix This

Hard to understand what the numbers mean. Changes require find-and-replace across multiple files.

How to Fix
  1. Define named constants (SAMPLE_RATE, SILENCE_THRESHOLD, MIN_SILENCE_DURATION)
  2. Centralize constants in config.py
AI Fix Prompt

"Extract magic numbers in server/audio.py to named constants: SAMPLE_RATE = 16000, SILENCE_THRESHOLD = 0.015, MIN_SILENCE_SECONDS = 0.3, FFT_SIZE = 512."

Code
audio = audio.set_frame_rate(16000)  # what is 16000?
if amplitude < 0.015:  # what threshold?
    if silence_frames > int(16000 * 0.3):  # ??
What to Remember

Name every magic number. SAMPLE_RATE = 16000 is self-documenting; 16000 alone is not.

MINOR Medium

Inconsistent Error Response Format

server/app.py:45
What's Happening

Different endpoints return errors in different formats: sometimes {"error": "..."}, sometimes {"message": "..."}, sometimes plain text.

What Happens if You Don't Fix This

Clients must handle different error formats, leading to unreliable error display.

How to Fix
  1. Create unified error handler with Flask errorhandler()
  2. Define consistent format: {"error": {"code": "...", "message": "..."}}
  3. Update all endpoints to consistent format
AI Fix Prompt

"Create an error handler in server/app.py using @app.errorhandler that returns {"error": {"code": status_code, "message": str(error)}} for all error responses."

Code
# Endpoint A:
return jsonify({"error": "File too large"}), 400
# Endpoint B:
return jsonify({"message": "Invalid format"}), 422
# Endpoint C:
return "Server error", 500
What to Remember

Standardize error responses across your API. Clients should parse errors the same way everywhere.
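The unified payload can be built by one helper; in Flask it would be returned from an @app.errorhandler hook, shown here framework-free:

```python
def error_response(code, message):
    """Single error shape for every endpoint: {"error": {code, message}}."""
    return {"error": {"code": code, "message": message}}, code
```

Usage inside a handler would then be `return error_response(400, "File too large")`, so clients parse every error the same way.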

MINOR Quick Fix

No Input Length Validation on Text Fields

server/models.py:23
What's Happening

Text fields like title and notes accept arbitrarily long strings without length restriction at the application level.

What Happens if You Don't Fix This

Extremely long inputs can stress the database and break UI elements.

How to Fix
  1. Define maximum length at model level with String(255)
  2. Validate before saving
  3. Set maxlength attribute on client side
AI Fix Prompt

"Add length constraints to server/models.py: title = Column(String(200), nullable=False), notes = Column(String(5000)). Add validation in the route handlers."

Code
class Transcription(db.Model):
    title = db.Column(db.String)  # no length limit
    notes = db.Column(db.Text)  # no length limit
What to Remember

Always define maximum lengths for user-provided text fields. Unbounded input is a security and stability risk.
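The application-level half of the fix can be sketched as a validator; the limits mirror the String(200)/String(5000) constraints suggested in the prompt:

```python
MAX_TITLE = 200
MAX_NOTES = 5000

def validate_lengths(title, notes):
    """Return a list of human-readable errors; empty list means valid."""
    errors = []
    if not title or len(title) > MAX_TITLE:
        errors.append(f"title must be 1-{MAX_TITLE} characters")
    if notes is not None and len(notes) > MAX_NOTES:
        errors.append(f"notes must be at most {MAX_NOTES} characters")
    return errors
```

The database column constraints remain the backstop; this check exists so users get a clear 400 instead of a database error.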

CRITICAL Medium

No Error Handling on Whisper API Calls

server/transcribe.py:156
What's Happening

API calls to the Whisper service have no timeout, no retry, and no specific error handling. A 500 or timeout crashes the entire request handler.

What Happens if You Don't Fix This

Transient API errors cause complete transcription failure. Users lose their recording without an error message.

How to Fix
  1. Set timeout for API calls (e.g., 30 seconds)
  2. Implement retry logic with exponential backoff
  3. Catch specific errors and return user-friendly messages
AI Fix Prompt

"Wrap Whisper API calls in server/transcribe.py with tenacity retry decorator: @retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10)). Add timeout=30 to requests."

Code
response = openai.audio.transcriptions.create(
    model=WHISPER_MODEL,
    file=audio_chunk,
    language=language
)  # no timeout, no retry, no error handling
What to Remember

All external API calls need timeout, retry, and error handling. Assume the network will fail.
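The retry-with-exponential-backoff idea can be sketched in stdlib Python; the tenacity decorator from the prompt does the same job with less code:

```python
import time

def call_with_retry(fn, attempts=3, base_delay=0.01):
    """Retry fn on transient errors, doubling the delay each attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except (TimeoutError, ConnectionError):  # transient errors only
            if attempt == attempts - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))
```

Only transient error types are retried; a 400-style client error should fail immediately rather than be repeated three times.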

INFO Medium

Logging Uses print() Instead of logging Module

server/transcribe.py:42
What's Happening

The entire project uses print() for output instead of the Python logging module. 47 print() calls spread across 6 files.

What Happens if You Don't Fix This

No log levels, no structured output, no ability to filter logs in production or send to log aggregators.

How to Fix
  1. Configure Python logging module
  2. Replace print() with logger.info/warning/error
  3. Set up structured logging (JSON) for production
AI Fix Prompt

"Replace all print() calls with proper logging: import logging; logger = logging.getLogger(__name__); replace print("Error:...") with logger.error("..."), print("Processing...") with logger.info("...")."

Code
print(f"Processing file: {filename}")  # should be logger.info
print(f"ERROR: {str(e)}")  # should be logger.error
print(f"Transcription complete in {elapsed}s")  # should be logger.info
What to Remember

Use the logging module from day one. Structured logs with levels are essential for production debugging.
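A minimal setup replacing the print() calls could look like this sketch (the logger name and message are illustrative):

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("codedictate.transcribe")
logger.setLevel(logging.INFO)  # explicit, in case root is configured higher

def report_progress(filename, elapsed):
    # lazy %-formatting: the string is only built if INFO is enabled
    logger.info("Transcription of %s complete in %.1fs", filename, elapsed)
```

Swapping the formatter for a JSON one later gives structured production logs without touching any call site.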

Documentation

Why This Matters

Documentation bridges you and your future self (and the AI). Missing docs = most common reason AI projects get abandoned. AI can write them — just ask.

MAJOR Medium

README Has Only Project Title

README.md:1
What's Happening

The README.md contains only "# codedictate" and a one-line description. Installation, usage, API documentation, and architecture are missing.

What Happens if You Don't Fix This

New developers cannot set up or understand the project without reading the code.

How to Fix
  1. Add sections: Installation, Quickstart, API Documentation, Architecture
  2. Document system requirements (Python version, FFmpeg, etc.)
  3. Provide example commands for setup and start
AI Fix Prompt

"Expand README.md with sections: ## Features, ## Requirements (Python 3.10+, FFmpeg, OpenAI API key), ## Installation, ## Quick Start, ## API Endpoints, ## Configuration, ## Testing."

Code
# codedictate
Voice-to-text dictation tool.
What to Remember

A good README is the most cost-effective documentation. It saves hours of onboarding time per developer.

MINOR Significant

No API Documentation or Schema

server/app.py:1
What's Happening

None of the 8 API endpoints are documented. Neither OpenAPI/Swagger nor inline docstrings describe request/response formats.

What Happens if You Don't Fix This

Frontend developers must read the backend code to understand the API. Integration becomes guesswork.

How to Fix
  1. Install flask-openapi3 or flasgger for automatic API docs
  2. At minimum add docstrings with input/output per route
  3. Maintain OpenAPI spec as YAML
AI Fix Prompt

"Add docstrings to all route handlers in server/app.py with format: Args (JSON body fields), Returns (response format), Raises (error cases). Consider adding flask-openapi3."

Code
@app.route("/transcribe", methods=["POST"])
def transcribe():
    # No docstring, no type hints, no documentation
    file = request.files["audio"]
    ...
What to Remember

Document your API at least with docstrings. For public APIs, use OpenAPI/Swagger for interactive documentation.

MINOR Medium

Missing Docstrings on 23 of 28 Functions

server/transcribe.py:1
What's Happening

82% of functions in the project have no docstring. Only 5 of 28 functions describe what they do, what they expect, and what they return.

What Happens if You Don't Fix This

Developers must read the implementation to understand the interface. Significantly increases onboarding time.

How to Fix
  1. Add docstrings to all public functions
  2. Format: short description, Args, Returns, Raises
  3. Most important: public API functions and complex logic
AI Fix Prompt

"Add Google-style docstrings to all public functions in server/transcribe.py, server/audio.py, and server/models.py. Include Args, Returns, and Raises sections."

Code
def chunk_audio(audio_data, sr=16000):
    # No docstring — what does it return? What format is audio_data?
    ...

def postprocess(text, language):
    # No docstring — what postprocessing? What languages supported?
    ...
What to Remember

Docstrings are the contract between functions. They should answer: what does it do, what does it need, what does it return.

MINOR Quick Fix

No Environment Variable Documentation

server/config.py:1
What's Happening

There is neither a .env.example nor documentation of which environment variables are required.

What Happens if You Don't Fix This

New developers or deployments fail because required variables are not set.

How to Fix
  1. Create .env.example with all required variables
  2. Comments with descriptions and example values
  3. Reference .env.example in README
AI Fix Prompt

"Create .env.example with all environment variables used in server/config.py: OPENAI_API_KEY, DATABASE_URL, FLASK_SECRET_KEY, FLASK_DEBUG, WHISPER_MODEL. Add descriptions as comments."

Code
# config.py uses these but nobody documents them:
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")  # required but undocumented
DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///local.db")
SECRET_KEY = os.getenv("FLASK_SECRET_KEY", "dev-key-change-me")
What to Remember

A .env.example file is cheap insurance. It documents requirements and prevents deployment failures.

INFO Medium

No Architecture Decision Records

README.md:1
What's Happening

No documentation about why certain technology decisions were made (Whisper model choice, SQLite vs. PostgreSQL, client architecture).

What Happens if You Don't Fix This

Future developers repeat evaluations or change decisions without knowing the context.

How to Fix
  1. Create docs/adr/ directory
  2. Write ADR for each significant decision (lightweight format)
  3. Document context, decision, consequences
AI Fix Prompt

"Create docs/adr/001-whisper-model-selection.md and docs/adr/002-sqlite-for-development.md using the lightweight ADR template (Context, Decision, Consequences)."

Code
# No architecture documentation found.
# Questions that remain unanswered:
# - Why Whisper medium model instead of large?
# - Why SQLite instead of PostgreSQL?
# - Why desktop client instead of web-only?
What to Remember

Architecture Decision Records (ADRs) prevent repeating past discussions. Even one-paragraph ADRs save future time.

Best Practices

Why This Matters

AI knows patterns but doesn't always follow them. It mixes async/await with .then() and ignores conventions. Tell the AI which patterns your project uses and it will follow them.

MAJOR Quick Fix

No .gitignore Entries for Sensitive Files

.gitignore:1
What's Happening

The .gitignore file has no entries for .env, *.pem, *.key, or upload directories. Sensitive files could be accidentally committed.

What Happens if You Don't Fix This

An accidental git add . commits API keys, certificates, and user data to the repository.

How to Fix
  1. Add .env, *.pem, *.key, uploads/, *.sqlite to .gitignore
  2. Use gitignore.io template for Python/Flask
  3. Install git-secrets as pre-commit hook
AI Fix Prompt

"Update .gitignore to include: .env, .env.*, *.pem, *.key, uploads/, *.sqlite, *.db, __pycache__/, .pytest_cache/, htmlcov/."

Code
# .gitignore (incomplete):
__pycache__/
*.pyc
# Missing: .env, *.pem, *.key, uploads/, *.sqlite
What to Remember

Start every project with a comprehensive .gitignore. Use gitignore.io or GitHub templates as a baseline.

MAJOR Quick Fix

No Virtual Environment Configuration

setup.py:1
What's Happening

No documentation or configuration for virtual environments. Developers might install dependencies globally.

What Happens if You Don't Fix This

Global installations lead to version conflicts between projects and make builds non-reproducible.

How to Fix
  1. Provide Makefile or setup.sh with venv creation
  2. Document in README: python -m venv .venv && source .venv/bin/activate
  3. Add .venv/ to .gitignore
AI Fix Prompt

"Create a Makefile with: setup target (creates venv, installs deps), run target (starts server), test target (runs tests). Add .venv/ to .gitignore."

Code
# No Makefile, no setup script, no venv documentation
# Developers are expected to... guess?
# pip install -r requirements.txt  # global? venv? who knows!
What to Remember

Always use virtual environments. Document the setup process. Make it one command.

MINOR Quick Fix

No Health Check Endpoint

server/app.py:1
What's Happening

No /health or /readiness endpoint for monitoring, load balancers, or container orchestration.

What Happens if You Don't Fix This

Load balancers and monitoring tools cannot check the application state.

How to Fix
  1. Add GET /health endpoint that checks DB connection and API reachability
  2. HTTP 200 when healthy, 503 when not
  3. Use Docker HEALTHCHECK directive
AI Fix Prompt

"Add a /health endpoint to server/app.py that checks: database connectivity (SELECT 1), returns {"status": "healthy", "db": "ok"} or 503 with details."

Code
# No health check endpoint exists.
# Docker and monitoring have no way to check if the app is running correctly.
What to Remember

Health check endpoints are essential for production. They enable automated recovery and monitoring.
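
A framework-agnostic sketch of the check itself, assuming the payload shape from the prompt above (the Flask wiring is left as a comment because the app's actual structure is not visible here):

```python
def health_status(check_db) -> tuple[dict, int]:
    """Build a health-check payload and HTTP status code.

    check_db is any zero-argument callable that returns truthy when
    the database answers (e.g. lambda: db.execute("SELECT 1")).
    """
    try:
        db_ok = bool(check_db())
    except Exception:
        db_ok = False
    if db_ok:
        return {"status": "healthy", "db": "ok"}, 200
    return {"status": "unhealthy", "db": "error"}, 503

# Sketch of the Flask wiring (names assumed):
# @app.route("/health")
# def health():
#     payload, code = health_status(lambda: db.session.execute(text("SELECT 1")))
#     return jsonify(payload), code
```

Keeping the check logic separate from the route makes it trivially unit-testable.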

CRITICAL Quick Fix

Secret Key Uses Predictable Default

server/config.py:8
What's Happening

The Flask SECRET_KEY has a hardcoded fallback "dev-key-change-me" that gets used in production when the environment variable is not set.

What Happens if You Don't Fix This

With a known secret key, attackers can sign session cookies and gain admin access.

How to Fix
  1. Remove fallback — application should not start without SECRET_KEY
  2. Use secrets.token_hex(32) for production
  3. Add startup check that aborts on missing key
AI Fix Prompt

"In server/config.py, change SECRET_KEY to raise an error if not set: SECRET_KEY = os.environ["FLASK_SECRET_KEY"] # no fallback, must be set."

Code
SECRET_KEY = os.getenv("FLASK_SECRET_KEY", "dev-key-change-me")  # predictable!
What to Remember

Never provide default values for security-critical configuration. Fail loudly instead of running insecurely.
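
A fail-fast sketch of the fix (the helper name is illustrative; the env-var name matches the config shown above):

```python
import os

def load_secret_key(env=os.environ) -> str:
    """Abort startup if the secret key is missing, instead of
    silently falling back to a predictable default."""
    key = env.get("FLASK_SECRET_KEY")
    if not key:
        raise RuntimeError(
            "FLASK_SECRET_KEY is not set. Generate one with: "
            "python -c \"import secrets; print(secrets.token_hex(32))\""
        )
    return key
```

Crashing at startup with a clear message beats running with "dev-key-change-me" in production.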

INFO Quick Fix

No Automated Code Formatting

setup.py:1
What's Happening

No code formatter (Black, Ruff) configured. The code has inconsistent indentation, line lengths, and string quoting.

What Happens if You Don't Fix This

Inconsistent style creates unnecessary diff noise in code reviews and slows down reading.

How to Fix
  1. Configure ruff format (pyproject.toml)
  2. Set up pre-commit hook for automatic formatting
  3. Run once on entire project
AI Fix Prompt

"Add ruff configuration to pyproject.toml: [tool.ruff] line-length = 100. Create .pre-commit-config.yaml with ruff check and ruff format hooks. Run ruff format . on the entire project."

Code
# Inconsistent style throughout:
some_var="no spaces"  # line 12
other_var = "with spaces"  # line 13
very_long_function_call(argument1, argument2, argument3, argument4, argument5)  # 120+ chars
What to Remember

Pick a formatter (ruff, black), configure it once, never argue about style again.

Dependencies

Why This Matters

AI loves adding packages — sometimes unnecessarily. Every dependency is a risk. Always ask if a native solution exists. Fewer deps = fewer attack vectors.

MAJOR Medium

Unpinned Transitive Dependencies

requirements.txt:1
What's Happening

Only the 8 direct dependencies are pinned; their transitive dependencies are not, so pip install can resolve different versions on different machines.

What Happens if You Don't Fix This

Builds are not reproducible. "Works on my machine" becomes the standard problem.

How to Fix
  1. Use pip-compile (pip-tools) to generate a complete requirements.txt
  2. Or: use Poetry/PDM with lock file
  3. Commit requirements.txt and lock file to Git
AI Fix Prompt

"Install pip-tools (pip install pip-tools). Create requirements.in with direct deps. Run pip-compile requirements.in to generate a fully pinned requirements.txt with hashes."

Code
# requirements.txt — only direct deps pinned:
Flask==2.2.3
openai==1.6.1
SQLAlchemy==2.0.25
# But what version of Jinja2? Markupsafe? Click? Unknown!
What to Remember

Always pin all dependencies including transitive ones. Use pip-compile, Poetry, or PDM for reproducible builds.

MINOR Quick Fix

No Dependency License Audit

requirements.txt:1
What's Happening

No check whether the licenses of the 8 dependencies are compatible with the planned licensing model.

What Happens if You Don't Fix This

If the project is distributed commercially, GPL-licensed dependencies could cause legal issues.

How to Fix
  1. Install and run pip-licenses
  2. Check compatibility with planned licensing model
  3. Document results in LICENSES.md
AI Fix Prompt

"Run pip-licenses --format=table to audit all dependency licenses. Create LICENSES.md documenting the findings."

Code
# Unknown licenses in dependency tree:
# Flask — BSD-3-Clause (OK)
# openai — Apache-2.0 (OK)
# But what about transitive deps?
What to Remember

Audit dependency licenses early, especially if you plan to distribute commercially.

CRITICAL Medium

Flask 2.2.3 Has Known Security Vulnerability (CVE-2023-30861)

requirements.txt:1
What's Happening

Flask 2.2.3 is affected by CVE-2023-30861: under certain configurations, a response containing the permanent session cookie can be cached by a proxy and disclosed to other clients. The current version is 3.1.x.

What Happens if You Don't Fix This

A caching proxy can store a response carrying one user's session cookie and serve it to other clients, allowing session hijacking and impersonation.

How to Fix
  1. Update Flask to >= 3.0.0
  2. Review breaking changes in Flask 3.0 migration guide
  3. Run all tests after update
AI Fix Prompt

"In requirements.txt, update Flask from 2.2.3 to 3.1.0. Review the Flask 3.0 migration guide for breaking changes. Run tests after update."

Code
# requirements.txt
Flask==2.2.3  # CVE-2023-30861: session cookie disclosure via cached responses
Werkzeug==2.2.3  # also outdated, update together
What to Remember

Run pip-audit or safety check regularly. Pinned versions require active maintenance to stay secure.

INFO Quick Fix

setup.py Uses Deprecated Practices

setup.py:1
What's Happening

The project uses setup.py instead of pyproject.toml. setup.py is being superseded by pyproject.toml per PEP 517/518.

What Happens if You Don't Fix This

Minimal risk short-term, but future tooling will assume pyproject.toml.

How to Fix
  1. Migrate setup.py to pyproject.toml
  2. Switch build system to setuptools with pyproject.toml
AI Fix Prompt

"Convert setup.py to pyproject.toml following PEP 621. Use [build-system] requires = ["setuptools>=68.0"]. Move all metadata to [project] table."

Code
# setup.py (deprecated pattern):
from setuptools import setup
setup(
    name="codedictate",
    version="0.1.0",
    install_requires=[...]
)
What to Remember

Use pyproject.toml for new Python projects. It is the modern standard per PEP 517/518/621.

Security

Why This Matters

AI-generated code often contains security holes: hardcoded keys, missing validation, SQL concatenation. These are invisible: the code "works" but is like an unlocked front door. For every piece of AI-generated code, check input validation and whether secrets ended up in the source.

MAJOR Medium

Unrestricted File Upload Without Validation

server/app.py:67
What's Happening

The audio upload endpoint accepts any file without checking file type, size, or content.

What Happens if You Don't Fix This

Attackers can upload executable code or overwhelm the server with oversized files.

How to Fix
  1. Validate allowed file types (WAV, MP3, FLAC, OGG)
  2. Enforce maximum file size (e.g., 50 MB)
  3. Check MIME type and magic bytes
AI Fix Prompt

"Add file validation to the /upload endpoint in server/app.py: check file extension against ALLOWED_EXTENSIONS, enforce MAX_CONTENT_LENGTH, and verify magic bytes."

Code
@app.route("/upload", methods=["POST"])
def upload_audio():
    file = request.files["audio"]
    file.save(os.path.join(UPLOAD_DIR, file.filename))  # no type, size, or content checks
What to Remember

Always validate file uploads: check type, size, and content. Never trust the client-provided filename.
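
A sketch of the validation layer, using the extension list and 50 MB cap from the steps above (the magic-byte prefixes are an assumption covering common encodings of those formats):

```python
import os

ALLOWED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg"}
MAX_BYTES = 50 * 1024 * 1024  # 50 MB cap, per the suggestion above

# Leading magic bytes for the allowed formats. MP3 files may start
# with an ID3 tag or a raw 0xFF 0xFB frame header.
MAGIC_PREFIXES = (b"RIFF", b"fLaC", b"OggS", b"ID3", b"\xff\xfb")

def is_valid_audio_upload(filename: str, head: bytes, size: int) -> bool:
    """Reject uploads that fail the extension, size, or content check."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False
    if size == 0 or size > MAX_BYTES:
        return False
    return any(head.startswith(p) for p in MAGIC_PREFIXES)
```

In Flask, MAX_CONTENT_LENGTH should additionally be set so oversized bodies are rejected before they are read.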

MAJOR Quick Fix

Path Traversal via User-Controlled Filename

server/app.py:71
What's Happening

The filename is taken directly from the user without using secure_filename() or similar sanitization.

What Happens if You Don't Fix This

Attackers can write files to arbitrary directories (e.g., ../../etc/crontab).

How to Fix
  1. Use werkzeug.utils.secure_filename()
  2. Generate your own UUIDs as filenames
  3. Secure the upload directory with permissions
AI Fix Prompt

"In server/app.py line 71, replace file.filename with secure_filename(file.filename) from werkzeug.utils, or better yet, generate a UUID filename."

Code
file.save(os.path.join(UPLOAD_DIR, file.filename))
What to Remember

Never use user-supplied filenames directly. Use secure_filename() or generate unique names server-side.
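
A sketch of the UUID approach from step 2 (stdlib only; the extension whitelist is an assumption mirroring the upload finding above):

```python
import os
import uuid

def safe_upload_path(upload_dir: str, client_filename: str) -> str:
    """Ignore the client-supplied name except for its extension;
    store the file under a server-generated UUID."""
    ext = os.path.splitext(client_filename)[1].lower()
    # Whitelist the extension so a name like "../../etc/crontab"
    # cannot smuggle path components into the saved filename.
    if ext not in {".wav", ".mp3", ".flac", ".ogg"}:
        ext = ""
    return os.path.join(upload_dir, f"{uuid.uuid4().hex}{ext}")
```

werkzeug.utils.secure_filename() is the lighter alternative, but UUIDs also prevent collisions and filename-based probing.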

MINOR Quick Fix

CORS Allows All Origins

server/app.py:18
What's Happening

CORS is configured with origins="*", allowing requests from any domain.

What Happens if You Don't Fix This

Third-party websites can make API requests on behalf of authenticated users.

How to Fix
  1. Restrict allowed origins to your own domain
  2. Allow localhost in development, only your domain in production
AI Fix Prompt

"In server/app.py line 18, replace CORS(app, origins="*") with CORS(app, origins=os.environ.get("ALLOWED_ORIGINS", "http://localhost:3000").split(","))."

Code
CORS(app, origins="*")
What to Remember

Restrict CORS to the minimum necessary origins. Wildcard origins bypass the same-origin policy entirely.
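
A sketch of the environment-driven whitelist (the ALLOWED_ORIGINS variable name follows the prompt above; the localhost default is an assumption for development):

```python
import os

def allowed_origins(env=os.environ) -> list[str]:
    """Read a comma-separated origin whitelist from the environment,
    defaulting to local development only."""
    raw = env.get("ALLOWED_ORIGINS", "http://localhost:3000")
    return [origin.strip() for origin in raw.split(",") if origin.strip()]

# Sketch of the wiring: CORS(app, origins=allowed_origins())
```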

CRITICAL Quick Fix

Hardcoded API Key in Source Code

server/config.py:14
What's Happening

The OpenAI API key is hardcoded directly in the source code and gets committed to the repository with every push.

What Happens if You Don't Fix This

Attackers can extract the API key from git history and make API calls at your expense.

How to Fix
  1. Move API key to environment variables
  2. Create .env file and add to .gitignore
  3. Rotate the existing key immediately
AI Fix Prompt

"Replace the hardcoded OPENAI_API_KEY in server/config.py with os.environ.get("OPENAI_API_KEY") and add a .env.example file."

Code
OPENAI_API_KEY = "sk-proj-abc123def456ghi789"
What to Remember

Never commit API keys or secrets to version control. Use environment variables or a secrets manager.

CRITICAL Medium

SQL Injection via Raw Query

server/models.py:87
What's Happening

User input is directly interpolated into a SQL query without parameterization or escaping.

What Happens if You Don't Fix This

Attackers can execute arbitrary SQL commands, steal data, or drop the entire database.

How to Fix
  1. Replace string interpolation with parameterized queries
  2. Use SQLAlchemy ORM methods instead of raw SQL
  3. Add input validation as an additional layer of defense
AI Fix Prompt

"In server/models.py line 87, replace the f-string SQL query with a parameterized SQLAlchemy query using bindparams or ORM methods."

Code
db.execute(f"SELECT * FROM transcriptions WHERE user_id = '{user_id}' AND title LIKE '%{search}%'")
What to Remember

Always use parameterized queries. Never interpolate user input into SQL strings.
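
A parameterized sketch of the query above (shown with stdlib sqlite3 so it is self-contained; SQLAlchemy's text() with bound parameters follows the same principle):

```python
import sqlite3

def search_transcriptions(conn: sqlite3.Connection, user_id: str, search: str):
    """Safe version of the vulnerable query: the driver binds the
    values, so quotes in user input cannot break out of the SQL."""
    # Note: "%" or "_" inside `search` still act as LIKE wildcards;
    # escape them separately if that matters for your use case.
    return conn.execute(
        "SELECT * FROM transcriptions WHERE user_id = ? AND title LIKE ?",
        (user_id, f"%{search}%"),
    ).fetchall()
```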

CRITICAL Significant

Missing Authentication on Admin Endpoints

server/app.py:203
What's Happening

Admin routes (/admin/users, /admin/stats) are accessible without any authentication whatsoever.

What Happens if You Don't Fix This

Anyone can view user data, delete accounts, and modify system settings.

How to Fix
  1. Implement authentication middleware
  2. Add JWT or session-based auth for admin routes
  3. Introduce role-based access control (RBAC)
AI Fix Prompt

"Add a @require_admin decorator to all /admin/* routes in server/app.py. Implement JWT-based authentication in server/auth.py."

Code
@app.route("/admin/users")
def admin_users():  # no authentication or authorization check
    users = User.query.all()
    return jsonify([u.to_dict() for u in users])
What to Remember

Every admin endpoint must require authentication and authorization. Defense in depth means checking at every layer.
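
A framework-agnostic sketch of the @require_admin decorator (the get_current_user callable and is_admin attribute are hypothetical; in the real app this would read a verified session or JWT, as the prompt above suggests for server/auth.py):

```python
from functools import wraps

def require_admin(get_current_user):
    """Decorator factory: get_current_user() returns the authenticated
    user object, or None if the request is anonymous."""
    def decorator(view):
        @wraps(view)
        def wrapped(*args, **kwargs):
            user = get_current_user()
            if user is None:
                return {"error": "unauthorized"}, 401   # not logged in
            if not getattr(user, "is_admin", False):
                return {"error": "forbidden"}, 403      # logged in, not admin
            return view(*args, **kwargs)
        return wrapped
    return decorator
```

Distinguishing 401 (authentication) from 403 (authorization) keeps the two checks honest.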

CRITICAL Quick Fix

Flask Debug Mode Enabled in Production Config

server/app.py:312
What's Happening

The application starts with debug=True, enabling the interactive debugger and code reload in production.

What Happens if You Don't Fix This

The Werkzeug debugger allows remote code execution. Attackers can run arbitrary Python code on your server.

How to Fix
  1. Control debug=True via environment variable
  2. Set FLASK_DEBUG=0 in production
  3. Use Gunicorn or uWSGI as production WSGI server
AI Fix Prompt

"In server/app.py line 312, replace app.run(debug=True) with app.run(debug=os.environ.get("FLASK_DEBUG", "0") == "1")."

Code
if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000, debug=True)
What to Remember

Never enable debug mode in production. It exposes an interactive debugger that allows arbitrary code execution.

INFO Medium

No Rate Limiting on API Endpoints

server/app.py:1
What's Happening

None of the API endpoints implement rate limiting, not even the authentication routes or the resource-intensive upload and transcription operations.

What Happens if You Don't Fix This

Attackers can perform brute-force attacks or overwhelm the server with mass requests.

How to Fix
  1. Install and configure Flask-Limiter
  2. Set strict limits for auth endpoints (e.g., 5/minute)
  3. Set moderate limits for upload endpoints (e.g., 10/hour)
AI Fix Prompt

"Add Flask-Limiter to server/app.py. Apply @limiter.limit("5/minute") to auth endpoints and @limiter.limit("10/hour") to /upload."

Code
# No rate limiting configuration found anywhere in the project
What to Remember

Rate limiting is essential for all public-facing APIs, especially auth and file upload endpoints.

Ready for your own report?

47 findings in this project. How many does yours have?

Analyze your code — EUR 19 Back to homepage

One-time EUR 19 incl. VAT · No subscription · Report in 15-45 min. via email