How I built a Milvus ALTER command in Django (before native support existed)

A developer at a company using Milvus as their vector database built a custom ALTER command in Django to change collection schemas before Milvus had native support. The command creates a new collection and migrates data, handling field additions and removals automatically. It is packaged as a reusable Django management command for safe team use.

At my company we use Milvus as our vector database. We had multiple collections in production with customer data. At some point we needed to change the schemas (add new fields and remove old ones) without losing any of that data.

The problem was simple: Milvus had no native ALTER command at the time. The only official workaround mentioned in their GitHub issues was to create a new collection and migrate the data yourself.

So that's exactly what I did, but I wrapped it into a reusable Django management command so anyone on the team could run it safely. Nothing fancy.

The command takes four things: the database, the collection name, the new schema, and a batch size.

It then does this:

1. Drops the temporary collection if a leftover one already exists
2. Compares the new fields with the old fields, ignoring the id field
3. Adds default values to the new fields
4. Converts the new schema into a Milvus collection schema
5. Reads the data in batches from the old collection and inserts it into the new one
6. Drops the old collection and renames the new collection to take its place

Removals are handled automatically: if a field isn't in the new schema, it just doesn't get copied. Additions get a default value assigned based on their datatype. You can also pass your own default.

One thing it doesn't support is updating existing values in a field. That would need some changes to the existing script, but we never needed it so I left it out.

You pass the new schema as a list of field definitions:

```python
schema = [
    {"field_name": "file_id", "datatype": "VARCHAR", "max_length": 500},
    {"field_name": "vector", "datatype": "FLOAT_VECTOR", "dim": 1024},
    {"field_name": "text", "datatype": "VARCHAR", "max_length": 65535},
    {"field_name": "metadata", "datatype": "JSON"},
    {"field_name": "page_num", "datatype": "VARCHAR", "max_length": 500},
    {"field_name": "name", "datatype": "VARCHAR", "max_length": 500},
]
```

The id primary key is handled automatically, so you don't need to include it. You can also assign a default here by adding a default_value key to a field.
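To make the add/remove rules concrete, here is a minimal sketch of how the fields get classified before any data moves. It is not the script itself: it uses plain Python values instead of NumPy types, needs no Milvus connection, and the `legacy_flag` and `source` fields are made up for illustration.

```python
# Sketch only: plain-Python stand-ins, no Milvus connection needed.
# Defaults mirror the ones the script assigns, minus the NumPy wrappers.
TYPE_DEFAULTS = {
    "varchar": "",
    "bool": True,
    "int8": 0,
    "int16": 0,
    "int32": 0,
    "int64": 0,
    "float": 3.14,
    "double": 3.14,
}


def plan_migration(old_names, new_schema):
    """Classify fields as copied, added (with a default), or removed."""
    old = set(old_names) - {"id"}  # the primary key is recreated automatically
    new = {f["field_name"]: f for f in new_schema if f["field_name"] != "id"}
    new_names = set(new)

    added = {}
    for name in sorted(new_names - old):
        field = new[name]
        if "default_value" in field:
            added[name] = field["default_value"]  # caller-supplied default wins
        else:
            # Types without an entry (JSON, vectors) get no default here.
            added[name] = TYPE_DEFAULTS.get(field["datatype"].lower())

    return {
        "copied": sorted(new_names & old),
        "added": added,
        "removed": sorted(old - new_names),
    }


# Hypothetical old collection and new schema.
old_names = ["id", "file_id", "vector", "text", "legacy_flag"]
new_schema = [
    {"field_name": "file_id", "datatype": "VARCHAR", "max_length": 500},
    {"field_name": "vector", "datatype": "FLOAT_VECTOR", "dim": 1024},
    {"field_name": "text", "datatype": "VARCHAR", "max_length": 65535},
    {"field_name": "page_num", "datatype": "VARCHAR", "max_length": 500},
    {"field_name": "source", "datatype": "VARCHAR", "max_length": 100,
     "default_value": "unknown"},
]

plan = plan_migration(old_names, new_schema)
print(plan)
```

Here `legacy_flag` is dropped simply because nothing asks for it, `page_num` picks up the empty-string default for VARCHAR, and `source` keeps the default that was passed in.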
The script relies on a small MilvusManager wrapper around the pymilvus client:

```python
from pymilvus import MilvusClient, DataType


class MilvusManager:
    def __init__(self, database):
        # Fill in your own connection details.
        self.client = MilvusClient(uri="", token="")
        self.client.use_database(database)

    def create_schema(self):
        schema = self.client.create_schema()
        schema.add_field(
            field_name="id", datatype=DataType.INT64, is_primary=True, auto_id=True
        )
        return schema

    def create_index(self):
        index_params = self.client.prepare_index_params()
        return index_params

    def create_collection(self, collection_name):
        if self.client.has_collection(collection_name):
            return
        return self.client.create_collection(
            collection_name=collection_name,
            schema=self.create_schema(),
            index_params=self.create_index(),
        )

    def drop_collection(self, collection_name):
        if not self.client.has_collection(collection_name):
            return
        return self.client.drop_collection(collection_name=collection_name)

    def insert_row(self, collection_name, data):
        # Create the collection on first insert if it does not exist yet.
        if not self.client.has_collection(collection_name):
            self.create_collection(collection_name)
        return self.client.insert(collection_name=collection_name, data=data)

    def delete_rows(self, collection_name, filter_expr):
        self.client.load_collection(collection_name=collection_name)
        results = self.client.delete(collection_name=collection_name, filter=filter_expr)
        self.client.release_collection(collection_name=collection_name)
        return results

    def query(self, collection_name, filter, output_fields=None):
        if not output_fields:
            output_fields = ["*"]  # return every field
        self.client.load_collection(collection_name=collection_name)
        data = self.client.query(
            collection_name=collection_name, filter=filter, output_fields=output_fields
        )
        self.client.release_collection(collection_name=collection_name)
        return data

    def upsert(self, collection_name, data):
        self.client.load_collection(collection_name=collection_name)
        self.client.upsert(collection_name=collection_name, data=data)
        self.client.release_collection(collection_name=collection_name)

    def close(self):
        self.client.close()
```

Quick note: this was written for our internal use case.
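One thing worth noticing in the wrapper is the load, operate, release pattern: if the middle step raises, the release call is skipped. A small context manager is one way to guard against that. This is only a sketch of the idea, not part of our code: `loaded` is a name I made up, and `FakeClient` is a stand-in that records calls so the example runs without a Milvus server.

```python
from contextlib import contextmanager


@contextmanager
def loaded(client, collection_name):
    """Load a collection, hand control back, and always release it afterwards."""
    client.load_collection(collection_name=collection_name)
    try:
        yield client
    finally:
        client.release_collection(collection_name=collection_name)


class FakeClient:
    """Hypothetical stand-in that records calls instead of talking to Milvus."""

    def __init__(self):
        self.calls = []

    def load_collection(self, collection_name):
        self.calls.append(("load", collection_name))

    def release_collection(self, collection_name):
        self.calls.append(("release", collection_name))

    def query(self, collection_name, filter, output_fields):
        self.calls.append(("query", collection_name))
        raise RuntimeError("simulated failure")


fake = FakeClient()
try:
    with loaded(fake, "docs") as client:
        client.query(collection_name="docs", filter="id > 0", output_fields=["*"])
except RuntimeError:
    pass  # the failure is expected; the point is that release still ran

print(fake.calls)
```

With a real client the same `with loaded(...)` block would wrap the query, delete, or upsert call.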
And here is the alter script itself:

```python
import logging

import numpy as np
from milvus_manager import MilvusManager
from pymilvus import DataType

logger = logging.getLogger("django")


def get_field_schema(client, new_schema):
    """Convert the new schema JSON to a Milvus collection schema."""
    logger.info("Creating new schema...")
    schema = client.create_schema()
    type_map = {t.name: t for t in DataType}
    for i, x in enumerate(new_schema):
        typ = x.get("datatype")
        datatype = type_map.get((typ or "").upper())
        if not datatype:
            logger.error(f"Invalid datatype at pos {i} {x}")
            raise ValueError("Invalid datatype. Check logs for more info")
        x["datatype"] = datatype
        try:
            schema.add_field(**x)
        except Exception:
            logger.error(f"Invalid parameter at pos {i} {x}")
            raise
    # The primary key is always recreated as an auto-generated INT64.
    schema.add_field(
        field_name="id", datatype=DataType.INT64, is_primary=True, auto_id=True
    )
    return schema


def get_clean_fields(fields):
    """Turn enum and protobuf values in the old field list into plain values."""
    for x in fields:
        x["type"] = x["type"].name
        if "default_value" in x:
            x["default_value"] = x["default_value"].ListFields()[0][-1]
    return fields


def get_fields(milvus_manager, collection_name):
    """Get the fields and field names from the old collection, skipping id."""
    old_schema_fields = milvus_manager.client.describe_collection(collection_name)["fields"]
    old_schema_fields = [field for field in old_schema_fields if field["name"] != "id"]
    field_names = {f["name"] for f in old_schema_fields}
    return old_schema_fields, field_names


def convert_new_schema(new_schema, output_fields):
    """Add default values to the fields that are new in the schema."""
    fields = []
    for field in new_schema:
        if field.get("field_name") == "id":
            logger.info("Field id detected.. It will be automatically rewritten")
            continue
        typ = field.get("datatype")
        # Fields that already exist are copied as-is and need no default.
        if field.get("field_name") in output_fields:
            fields.append(field)
            continue
        if typ:
            # An explicit default always wins, even a falsy one like 0 or False.
            if "default_value" in field:
                fields.append(field)
                continue
            if typ.lower() == "varchar":
                field["default_value"] = ""
                if not field.get("max_length"):
                    field["max_length"] = 65535
            elif typ.lower() == "bool":
                field["default_value"] = True
            elif typ.lower() == "int8":
                field["default_value"] = np.int8(0)
            elif typ.lower() == "int16":
                field["default_value"] = np.int16(0)
            elif typ.lower() == "int32":
                field["default_value"] = np.int32(0)
            elif typ.lower() == "int64":
                field["default_value"] = np.int64(0)
            elif typ.lower() == "float":
                field["default_value"] = np.float32(3.14)
            elif typ.lower() == "double":
                field["default_value"] = np.float64(3.14)
            fields.append(field)
        else:
            logger.error(f"Data type not provided terminating .... {field}")
            raise ValueError("Datatype not provided...")
    return fields


def create_collection(milvus_manager, new_collection, schema):
    logger.info(f"Creating collection {new_collection}")
    index_params = milvus_manager.create_index()
    milvus_manager.client.create_collection(
        collection_name=new_collection,
        schema=schema,
        index_params=index_params,
        enable_dynamic_field=True,
    )
    logger.info(f"Collection {new_collection} created")


def get_collection_iterator(milvus_manager, collection_name, output_fields, batch_size):
    logger.info(f"Fetching data from {collection_name} with batch size {batch_size}")
    milvus_manager.client.load_collection(collection_name)
    results = milvus_manager.client.query_iterator(
        collection_name=collection_name,
        filter="id > 0",  # matches every row
        batch_size=batch_size,
        output_fields=output_fields,
    )
    logger.info("Data fetched")
    return results


def insert_into_collection(milvus_manager, new_collection, results):
    milvus_manager.client.load_collection(collection_name=new_collection)
    logger.info(f"Inserting into collection {new_collection}")
    while True:
        result = results.next()
        if not result:
            break
        # Drop the old primary keys; the new collection generates its own.
        for x in result:
            if "id" in x:
                del x["id"]
        milvus_manager.client.insert(collection_name=new_collection, data=result)
    results.close()
    logger.info(f"Insertion into collection {new_collection} completed")


def cleanup(milvus_manager, collection_name, new_collection):
    logger.info(f"Dropping old collection {collection_name}")
    milvus_manager.drop_collection(collection_name)
    logger.info(f"Old collection {collection_name} dropped")
    milvus_manager.client.rename_collection(new_collection, collection_name)
    logger.info(f"Collection {new_collection} renamed")


def alter_collection(milvus_manager, collection_name, new_schema, batch_size):
    """
    Here we:
    1. Drop the new collection if it already exists
    2. Compare new fields with old fields and filter out the id field
    3. Add default values to the new schema
    4. Convert the new schema JSON to a collection schema
    5. Get the data in batches from the old collection and insert into the new collection
    6. Delete the old collection and rename the new collection.
    """
    new_collection = f"{collection_name}_temp"
    milvus_manager.drop_collection(new_collection)

    old_fields, old_names = get_fields(milvus_manager, collection_name)
    new_names = {f["field_name"] for f in new_schema}

    # old_names never contains id, so ignore it on the new side as well.
    if new_names - {"id"} == old_names:
        logger.info("No changes detected terminating...")
        return

    # Only fields present in both schemas are read from the old collection.
    output_fields = list(new_names.intersection(old_names))
    new_schema = convert_new_schema(new_schema, output_fields)
    old_fields = get_clean_fields(old_fields)
    schema = get_field_schema(milvus_manager.client, new_schema)
    results = get_collection_iterator(
        milvus_manager, collection_name, output_fields, batch_size
    )
    try:
        create_collection(milvus_manager, new_collection, schema)
        insert_into_collection(milvus_manager, new_collection, results)
        cleanup(milvus_manager, collection_name, new_collection)
    except Exception as e:
        logger.error(
            f"Got an error for collection {collection_name}: {str(e)}", exc_info=True
        )


def main(database, collection_name, new_schema, batch_size):
    """The entrypoint"""
    milvus = MilvusManager(database)
    alter_collection(milvus, collection_name, new_schema, batch_size)
```

After this I called this function from inside a Django management command.

Milvus has since added a native call for adding a field:

```python
client.add_collection_field(
    collection_name="my_collection",
    field_name="new_field",
    data_type=DataType.VARCHAR,
    max_length=500,
    nullable=True,
)
```

It does not support removing an old field or adding a default.

Thanks for reading.
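As a postscript, since the post only mentions the management command in passing, here is a rough sketch of what such a wrapper could look like. It is not our actual command: the option names, the default batch size of 1000, the idea of reading the schema from a JSON file, and the `alter_collection` module name are all assumptions. The import fallback is only there so the sketch can be run without Django installed.

```python
import argparse
import json

try:
    from django.core.management.base import BaseCommand, CommandError
except ImportError:  # fallback so the sketch runs without Django installed
    BaseCommand = object
    CommandError = Exception


def add_alter_arguments(parser):
    """Register the four inputs the alter script expects (names are assumptions)."""
    parser.add_argument("--database", required=True)
    parser.add_argument("--collection", required=True)
    parser.add_argument("--schema-file", required=True,
                        help="JSON file containing the list of field definitions")
    parser.add_argument("--batch-size", type=int, default=1000)


def parse_schema(text):
    """Parse and lightly validate the JSON list of field definitions."""
    schema = json.loads(text)
    if not isinstance(schema, list):
        raise ValueError("schema must be a JSON list of field definitions")
    for pos, field in enumerate(schema):
        if "field_name" not in field or "datatype" not in field:
            raise ValueError(f"field at position {pos} needs field_name and datatype")
    return schema


class Command(BaseCommand):
    help = "Recreate a Milvus collection with a new schema and copy the data over"

    def add_arguments(self, parser):
        add_alter_arguments(parser)

    def handle(self, *args, **options):
        # Hypothetical module name: wherever the alter script's main() lives.
        from alter_collection import main

        with open(options["schema_file"], encoding="utf-8") as fh:
            try:
                new_schema = parse_schema(fh.read())
            except ValueError as exc:
                raise CommandError(str(exc)) from exc
        main(options["database"], options["collection"], new_schema,
             options["batch_size"])
```

Saved under an app's `management/commands/` directory, it would be run with `python manage.py <file name> --database ... --collection ... --schema-file ...`, where the command name comes from whatever the file is called.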