{"slug": "ai-and-mysql-databases", "title": "AI and MySQL Databases", "summary": "Many AI-generated database schemas default to using VARCHAR(36) for primary key columns to store UUIDs, motivated by security concerns such as preventing sequential ID enumeration. However, this choice introduces significant performance trade-offs, including larger storage size, slower indexing, and increased memory usage compared to integer-based keys. A recommended hybrid approach is to use a BIGINT UNSIGNED primary key for performance while maintaining a separate GUID column for secure external references.", "body_md": "One notable mistake is that many AI-generated database schemas currently favour `VARCHAR(36)` for the `id` column.\n\nThis is typically motivated by a desire to prevent unauthorised access by malicious actors. To achieve this, UUIDs (Universally Unique Identifiers) are often employed as a security measure.\n\nTake WordPress as an example: it typically exposes post IDs in the URL, like `?p=45`. Compared to an SEO-friendly format such as `/2025/05/sample-post`, this numeric ID allows users to guess and access adjacent content by altering the number (e.g., `?p=44`). UUIDs, on the other hand, are not sequential and are far harder to guess.\n\nIn their canonical text form, UUIDs are 36-character strings (32 hexadecimal digits plus four hyphens) arranged in five sections separated by hyphens: `8-4-4-4-12`. Each section is made up of hexadecimal values:\n\n- `550e8400-e29b-41d4-a716-446655440000`\n- `f47ac10b-58cc-4372-a567-0e02b2c3d479`\n- `123e4567-e89b-12d3-a456-426614174000`\n\nA simple regex to validate UUIDs looks like this:\n\n`^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$`\n\nWhile UUIDs are hard to guess and portable across systems, they come with trade-offs. 
They're significantly larger than integers, which can slow down database performance, especially when used as primary keys in relational databases. Indexing, memory usage, and query speed all suffer due to the size and non-sequential nature of UUIDs.\n\nCurious about this design choice, I asked GitHub Copilot the following question:\n\nMe:\n\nWhy have you chosen the `id` column to be defined as `VARCHAR(36) PRIMARY KEY` in the database structure?\n\nGitHub Copilot:\n\nI chose `VARCHAR(36)` for the ID columns for a few specific reasons:\n\n- UUID Storage: The 36-character length fits UUID values, which follow a standardized format.\n- Multi-Tenant Systems: UUIDs help isolate tenants by eliminating sequential ID leakage, simplifying migrations, and improving security.\n- Data Portability: UUIDs ensure that exported data doesn't suffer from ID collisions during import into another system.\n\nTrade-offs:\n\n- Larger storage size (36 bytes vs. 4 bytes for `INT`)\n- Slower indexing and query performance\n- Increased memory usage\n\nAlternatives to consider:\n\n- `AUTO_INCREMENT INT/BIGINT`\n- `BINARY(16)` to store raw UUID bytes more efficiently\n- A custom ID generation strategy\n\nLet me know if you'd prefer a different approach.\n\nMe:\n\nUse `BIGINT UNSIGNED` for performance, but keep a `guid` field in each record for external references.\n\nGitHub Copilot:\n\nI'll update the schema to use `BIGINT UNSIGNED` as the primary key while adding a `guid` column for external references. This hybrid approach combines the best of both strategies:\n\n- Performance: Efficient indexing and joins via numeric primary keys\n- Security & Portability: GUIDs exposed in APIs, making internal IDs safe from enumeration\n\nSummary:\n\n- `BIGINT UNSIGNED`: Fast, compact, sequential\n- `GUID (VARCHAR)`: Unique, portable, externally safe\n\nWhile relational databases offer flexibility in choosing between UUIDs, integers, or hybrids, NoSQL systems like MongoDB take a different path.\n\nMongoDB uses a built-in type called `ObjectId` for its `_id` field. 
This is a 12-byte binary value that encodes:\n\n- a 4-byte timestamp (seconds since the Unix epoch)\n- a 5-byte random value, generated once per process\n- a 3-byte incrementing counter, initialised to a random value\n\nExample:\n\n`507f1f77bcf86cd799439011`\n\nObjectIds are more compact than a `VARCHAR(36)` UUID. They're also more index-friendly, because the leading timestamp keeps newly generated values roughly in insertion order.\n\nMany AI-generated systems default to MongoDB-like NoSQL setups when:\n\nIn these cases, using ObjectIds or similar unique, time-sortable identifiers is often a sensible default.\n\nThese architecture decisions are not theoretical; they directly influence how we design and scale platforms like PrimeCRM in real-world environments.", "url": "https://wpnews.pro/news/ai-and-mysql-databases", "canonical_source": "https://www.libertas.software/en/knowledge-hub/4/ai-and-mysql-databases", "published_at": "2025-05-16 00:00:00+00:00", "updated_at": "2026-05-22 13:56:20.941518+00:00", "lang": "en", "topics": ["data", "artificial-intelligence", "cybersecurity", "developer-tools"], "entities": ["WordPress", "UUID"], "alternates": {"html": "https://wpnews.pro/news/ai-and-mysql-databases", "markdown": "https://wpnews.pro/news/ai-and-mysql-databases.md", "text": "https://wpnews.pro/news/ai-and-mysql-databases.txt", "jsonld": "https://wpnews.pro/news/ai-and-mysql-databases.jsonld"}}