# AI and MySQL Databases

> Source: <https://www.libertas.software/en/knowledge-hub/4/ai-and-mysql-databases>
> Published: 2025-05-16 00:00:00+00:00

One notable mistake is that many AI-generated database schemas currently favour using VARCHAR(36) for the id column.
One notable mistake is that many AI-generated database schemas currently favour using VARCHAR(36)
for the id
column.
This is typically motivated by a desire to prevent unauthorised access by malicious actors. To achieve this, UUIDs (Universally Unique Identifiers) are often employed as a security measure.
Take WordPress as an example — it typically exposes post IDs in the URL like ?p=45
. Compared to an SEO-friendly format such as /2025/05/sample-post
, this numeric ID allows users to potentially guess and access adjacent content by altering the number (e.g., ?p=44
). UUIDs, on the other hand, are not sequential and are far harder to guess.
UUIDs are 36-character strings that follow a standard format with five sections separated by hyphens: 8-4-4-4-12
. Each section is made up of hexadecimal values:
550e8400-e29b-41d4-a716-446655440000
f47ac10b-58cc-4372-a567-0e02b2c3d479
123e4567-e89b-12d3-a456-426614174000
A simple regex to validate UUIDs looks like this:
^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$
While UUIDs offer security and portability, they come with trade-offs. They're significantly larger than integers, which can slow down database performance — especially when used as primary keys in relational databases. Indexing, memory usage, and query speed all suffer due to the size and non-sequential nature of UUIDs.
Curious about this design choice, I asked GitHub Copilot the following question:
Me:
Why have you chosen the id
column to be defined as VARCHAR(36) PRIMARY KEY
in the database structure?
GitHub Copilot:
I chose
VARCHAR(36)
for the ID columns for a few specific reasons:
- UUID Storage: The 36-character length fits UUID values, which follow a standardized format.
- Multi-Tenant Systems: UUIDs help isolate tenants by eliminating sequential ID leakage, simplifying migrations, and improving security.
- Data Portability: UUIDs ensure that exported data doesn't suffer from ID collisions during import into another system.
Trade-offs:
- Larger storage size (36 bytes vs. 4 bytes for
INT
)- Slower indexing and query performance
- Increased memory usage
Alternatives to consider:
AUTO_INCREMENT INT/BIGINT
BINARY(16)
to store raw UUID bytes more efficiently- A custom ID generation strategy
Let me know if you'd prefer a different approach.
Me:
Use BIGINT UNSIGNED
for performance, but keep a guid
field in each record for external references.
GitHub Copilot:
I'll update the schema to use
BIGINT UNSIGNED
as the primary key while adding aguid
column for external references. This hybrid approach combines the best of both strategies:
- Performance: Efficient indexing and joins via numeric primary keys
- Security & Portability: GUIDs exposed in APIs, making internal IDs safe from enumeration
Summary:
BIGINT UNSIGNED
: Fast, compact, sequentialGUID (VARCHAR)
: Unique, portable, externally safe
While relational databases offer flexibility in choosing between UUIDs, integers, or hybrids, NoSQL systems like MongoDB take a different path.
MongoDB uses a built-in type called ObjectId
for its _id
field. This is a 12-byte binary value that encodes:
Example:
507f1f77bcf86cd799439011
VARCHAR(36)
UUID. They're also more index-friendly.Many AI-generated systems default to MongoDB-like NoSQL setups when:
In these cases, using ObjectIds or similar unique, time-sortable identifiers is often a sensible default.
These architecture decisions are not theoretical; they directly influence how we design and scale platforms like PrimeCRM in real-world environments.
