AI and MySQL Databases

Many AI-generated database schemas default to using VARCHAR(36) for primary key columns to store UUIDs, motivated by security concerns such as preventing sequential ID enumeration. However, this choice introduces significant performance trade-offs, including larger storage size, slower indexing, and increased memory usage compared to integer-based keys. A recommended hybrid approach is to use a BIGINT UNSIGNED primary key for performance while maintaining a separate GUID column for secure external references.

One notable mistake is that many AI-generated database schemas currently favour using VARCHAR 36 for the id column. One notable mistake is that many AI-generated database schemas currently favour using VARCHAR 36 for the id column. This is typically motivated by a desire to prevent unauthorised access by malicious actors. To achieve this, UUIDs Universally Unique Identifiers are often employed as a security measure. Take WordPress as an example — it typically exposes post IDs in the URL like ?p=45 . Compared to an SEO-friendly format such as /2025/05/sample-post , this numeric ID allows users to potentially guess and access adjacent content by altering the number e.g., ?p=44 . UUIDs, on the other hand, are not sequential and are far harder to guess. UUIDs are 36-character strings that follow a standard format with five sections separated by hyphens: 8-4-4-4-12 . Each section is made up of hexadecimal values: 550e8400-e29b-41d4-a716-446655440000 f47ac10b-58cc-4372-a567-0e02b2c3d479 123e4567-e89b-12d3-a456-426614174000 A simple regex to validate UUIDs looks like this: ^ a-fA-F0-9 {8}- a-fA-F0-9 {4}- a-fA-F0-9 {4}- a-fA-F0-9 {4}- a-fA-F0-9 {12}$ While UUIDs offer security and portability, they come with trade-offs. They're significantly larger than integers, which can slow down database performance — especially when used as primary keys in relational databases. Indexing, memory usage, and query speed all suffer due to the size and non-sequential nature of UUIDs. Curious about this design choice, I asked GitHub Copilot the following question: Me: Why have you chosen the id column to be defined as VARCHAR 36 PRIMARY KEY in the database structure? GitHub Copilot: I chose VARCHAR 36 for the ID columns for a few specific reasons: - UUID Storage: The 36-character length fits UUID values, which follow a standardized format. - Multi-Tenant Systems: UUIDs help isolate tenants by eliminating sequential ID leakage, simplifying migrations, and improving security. - Data Portability: UUIDs ensure that exported data doesn't suffer from ID collisions during import into another system. Trade-offs: - Larger storage size 36 bytes vs. 4 bytes for INT - Slower indexing and query performance - Increased memory usage Alternatives to consider: AUTO INCREMENT INT/BIGINT BINARY 16 to store raw UUID bytes more efficiently- A custom ID generation strategy Let me know if you'd prefer a different approach. Me: Use BIGINT UNSIGNED for performance, but keep a guid field in each record for external references. GitHub Copilot: I'll update the schema to use BIGINT UNSIGNED as the primary key while adding aguid column for external references. This hybrid approach combines the best of both strategies: - Performance: Efficient indexing and joins via numeric primary keys - Security & Portability: GUIDs exposed in APIs, making internal IDs safe from enumeration Summary: BIGINT UNSIGNED : Fast, compact, sequentialGUID VARCHAR : Unique, portable, externally safe While relational databases offer flexibility in choosing between UUIDs, integers, or hybrids, NoSQL systems like MongoDB take a different path. MongoDB uses a built-in type called ObjectId for its id field. This is a 12-byte binary value that encodes: Example: 507f1f77bcf86cd799439011 VARCHAR 36 UUID. They're also more index-friendly.Many AI-generated systems default to MongoDB-like NoSQL setups when: In these cases, using ObjectIds or similar unique, time-sortable identifiers is often a sensible default. These architecture decisions are not theoretical; they directly influence how we design and scale platforms like PrimeCRM in real-world environments.