{"slug": "how-i-built-a-milvus-alter-command-in-django-before-native-support-existed", "title": "How I built a Milvus ALTER command in Django (before native support existed)", "summary": "A developer at a company using Milvus as their vector database built a custom ALTER command in Django to change collection schemas before Milvus had native support. The command creates a new collection and migrates data, handling field additions and removals automatically. It is packaged as a reusable Django management command for safe team use.", "body_md": "At my company we use Milvus as our vector database. We had multiple collections in production with customer data. At some point we needed to change the schemas, adding new fields and removing old ones, without losing any of that data.\n\nThe problem was simple: Milvus had no native ALTER command at the time. The only official workaround mentioned in their GitHub issues was to create a new collection and migrate the data yourself. So that's exactly what I did, but I wrapped it into a reusable Django management command so anyone on the team could run it safely.\n\nNothing fancy. The command takes four things:\n\n- the database name\n- the collection name\n- the new schema\n- the batch size used when copying data\n\nIt then does this:\n\n1. Drops the temporary collection (`<collection_name>_temp`) if it already exists\n2. Compares the new fields with the old fields, ignoring `id`\n3. Adds default values to the new fields\n4. Converts the new schema (JSON) into a Milvus collection schema\n5. Reads the old collection in batches and inserts the rows into the temporary collection\n6. Drops the old collection and renames the temporary one to the original name\n\nRemovals are handled automatically: if a field isn't in the new schema, it just doesn't get copied. Additions get a default value assigned based on their datatype. You can also pass your own default.\n\nOne thing it doesn't support is updating existing values in a field. 
That would need some changes to the existing script, but we never needed it, so I left it out.\n\nYou pass the new schema as a list of field definitions:\n\n```\nschema = [\n {\"field_name\": \"file_id\", \"datatype\": \"VARCHAR\", \"max_length\": 500},\n {\"field_name\": \"vector\", \"datatype\": \"FLOAT_VECTOR\", \"dim\": 1024},\n {\"field_name\": \"text\", \"datatype\": \"VARCHAR\", \"max_length\": 65535},\n {\"field_name\": \"metadata\", \"datatype\": \"JSON\"},\n {\"field_name\": \"page_num\", \"datatype\": \"VARCHAR\", \"max_length\": 500},\n {\"field_name\": \"name\", \"datatype\": \"VARCHAR\", \"max_length\": 500}\n]\n```\n\nThe `id` primary key is handled automatically, so you don't need to include it. You can also assign a default value to a field here.\n\n``` python\nfrom pymilvus import MilvusClient, DataType\n\nclass MilvusManager:\n    def __init__(self, database):\n        self.client = MilvusClient(\n            uri=\"\",\n            token=\"\"\n        )\n        self.client.use_database(database)\n\n    def create_schema(self):\n        schema = self.client.create_schema()\n        schema.add_field(\n            field_name=\"id\", datatype=DataType.INT64, is_primary=True, auto_id=True\n        )\n        return schema\n\n    def create_index(self):\n        index_params = self.client.prepare_index_params()\n        return index_params\n\n    def create_collection(self, collection_name):\n        if self.client.has_collection(collection_name):\n            return\n\n        return self.client.create_collection(\n            collection_name=collection_name,\n            schema=self.create_schema(),\n            index_params=self.create_index(),\n        )\n\n    def drop_collection(self, collection_name):\n        if not self.client.has_collection(collection_name):\n            return\n        return self.client.drop_collection(collection_name=collection_name)\n\n    def insert_row(self, collection_name, data):\n        if not 
self.client.has_collection(collection_name):\n            self.create_collection(collection_name)\n        return self.client.insert(collection_name=collection_name, data=data)\n\n    def delete_rows(self, collection_name, filter_expr):\n        self.client.load_collection(collection_name=collection_name)\n        results = self.client.delete(\n            collection_name=collection_name, filter=filter_expr\n        )\n        self.client.release_collection(collection_name=collection_name)\n        return results\n\n    def query(self, collection_name, filter, output_fields=None):\n        if not output_fields:\n            output_fields = [\"*\"]\n\n        self.client.load_collection(collection_name=collection_name)\n        data = self.client.query(\n            collection_name=collection_name, filter=filter, output_fields=output_fields\n        )\n        self.client.release_collection(collection_name=collection_name)\n        return data\n\n    def upsert(self, collection_name, data):\n        self.client.load_collection(collection_name=collection_name)\n        self.client.upsert(collection_name=collection_name, data=data)\n        self.client.release_collection(collection_name=collection_name)\n\n    def close(self):\n        self.client.close()\n```\n\n**Quick Note**: It was created for our internal use case.\n\n``` python\nimport logging\n\nimport numpy as np\nfrom milvus_manager import MilvusManager\nfrom pymilvus import DataType\n\nlogger = logging.getLogger(\"django\")\n\ndef get_field_schema(client, new_schema):\n    \"\"\"\n    Convert the new schema (JSON) to a Milvus collection schema\n    \"\"\"\n    logger.info(\"Creating new schema...\")\n    schema = client.create_schema()\n    # Map datatype names (e.g. \"VARCHAR\") to their DataType enum values\n    type_map = {x.name: x.value for x in list(DataType)}\n\n    for i, x in enumerate(new_schema):\n        typ = x.get(\"datatype\")\n\n        val = type_map.get(typ.upper())\n\n        if val is None:\n            logger.info(f\"Invalid datatype at pos {i} {x}\")\n            raise 
ValueError(\"Invalid datatype. Check logs for more info\")\n\n        x[\"datatype\"] = DataType(val)\n\n        try:\n            schema.add_field(**x)\n        except Exception:\n            logger.info(f\"Invalid parameter at pos {i} {x}\")\n            raise\n\n    # The primary key is always added here, so callers never pass it in\n    schema.add_field(field_name=\"id\", datatype=DataType.INT64, is_primary=True, auto_id=True)\n\n    return schema\n\ndef get_clean_fields(fields):\n    for x in fields:\n        x[\"type\"] = x[\"type\"].name\n        if \"default_value\" in x:\n            x[\"default_value\"] = x[\"default_value\"].ListFields()[0][-1]\n\n    return fields\n\ndef get_fields(milvus_manager, collection_name):\n    \"\"\"\n    Get the field names from the old collection\n    \"\"\"\n    old_schema_fields = milvus_manager.client.describe_collection(collection_name)[\"fields\"]\n\n    old_schema_fields = [field for field in old_schema_fields if field[\"name\"] != \"id\"]\n    field_names = {f[\"name\"] for f in old_schema_fields}\n\n    return old_schema_fields, field_names\n\ndef convert_new_schema(new_schema, output_fields):\n    \"\"\"\n    Add default values to the new schema\n    \"\"\"\n    fields = []\n\n    for field in new_schema:\n        # Skip a user-supplied `id` field; get_field_schema adds the primary key itself\n        if field.get(\"field_name\") == \"id\":\n            logger.info(\"Field id detected.. 
It will be automatically rewritten\")\n            continue\n\n        typ = field.get(\"datatype\")\n\n        # Fields that already exist in the old collection are kept as-is\n        if field.get(\"field_name\") in output_fields:\n            fields.append(field)\n            continue\n\n        if typ:\n            # Respect an explicit default, including falsy ones such as False, 0 or \"\"\n            if \"default_value\" in field:\n                fields.append(field)\n                continue\n\n            if typ.lower() == \"varchar\":\n                field[\"default_value\"] = \"\"\n\n                if not field.get(\"max_length\"):\n                    field[\"max_length\"] = 65535\n\n            elif typ.lower() == \"bool\":\n                field[\"default_value\"] = True\n\n            elif typ.lower() == \"int8\":\n                field[\"default_value\"] = np.int8(0)\n\n            elif typ.lower() == \"int16\":\n                field[\"default_value\"] = np.int16(0)\n\n            elif typ.lower() == \"int32\":\n                field[\"default_value\"] = np.int32(0)\n\n            elif typ.lower() == \"int64\":\n                field[\"default_value\"] = np.int64(0)\n\n            elif typ.lower() == \"float\":\n                field[\"default_value\"] = np.float32(3.14)\n\n            elif typ.lower() == \"double\":\n                field[\"default_value\"] = np.float64(3.14)\n\n            fields.append(field)\n        else:\n            logger.error(f\"Data type not provided terminating .... 
{field}\")\n            raise ValueError(\"Datatype not provided...\")\n\n    return fields\n\ndef create_collection(milvus_manager, new_collection, schema):\n    logger.info(f\"Creating collection {new_collection}\")\n    index_params = milvus_manager.create_index()\n\n    milvus_manager.client.create_collection(\n        collection_name=new_collection,\n        schema=schema,\n        index_params=index_params,\n        enable_dynamic_field=True,\n    )\n    logger.info(f\"Collection {new_collection} created\")\n\ndef get_collection_iterator(milvus_manager, collection_name, output_fields, batch_size):\n    logger.info(f\"Fetching data from {collection_name} with batch size {batch_size}\")\n    milvus_manager.client.load_collection(collection_name)\n    results = milvus_manager.client.query_iterator(\n        collection_name=collection_name,\n        filter=\"id>0\",\n        batch_size=batch_size,\n        output_fields=output_fields,\n    )\n    logger.info(\"Iterator created\")\n    return results\n\ndef insert_into_collection(milvus_manager, new_collection, results):\n    milvus_manager.client.load_collection(collection_name=new_collection)\n\n    logger.info(f\"Inserting into collection {new_collection}\")\n    while True:\n        result = results.next()\n\n        # An empty batch means the iterator is exhausted\n        if not result:\n            break\n\n        # Drop the old auto-generated ids; the new collection assigns its own\n        for x in result:\n            if \"id\" in x:\n                del x[\"id\"]\n\n        milvus_manager.client.insert(collection_name=new_collection, data=result)\n    results.close()\n    logger.info(f\"Insertion into collection {new_collection} completed\")\n\ndef cleanup(milvus_manager, collection_name, new_collection):\n    logger.info(f\"Dropping old collection {collection_name}\")\n    milvus_manager.drop_collection(collection_name)\n    logger.info(f\"Old collection {collection_name} dropped\")\n\n    milvus_manager.client.rename_collection(new_collection, collection_name)\n    logger.info(f\"Collection {new_collection} renamed to {collection_name}\")\n\ndef alter_collection(milvus_manager, 
collection_name, new_schema, batch_size):\n    \"\"\"\n    Here we:\n\n    1) Drop the temporary collection if it already exists\n    2) Compare new fields with old fields and filter out `id` field\n    3) Add default values to the new schema\n    4) Convert the new schema (json) to collection schema\n    5) Get the data in batches from old collection and insert into new collection\n    6) Delete the old collection and rename the new collection.\n    \"\"\"\n    new_collection = f\"{collection_name}_temp\"\n\n    milvus_manager.drop_collection(new_collection)\n\n    old_fields, old_names = get_fields(milvus_manager, collection_name)\n    new_names = {f[\"field_name\"] for f in new_schema}\n\n    tmp = new_names - old_names\n\n    if len(tmp) == 1:\n        if list(tmp)[0] == \"id\":\n            logger.info(\"No changes detected terminating...\")\n            return\n\n    if new_names == old_names:\n        logger.info(\"No changes detected terminating...\")\n        return\n\n    output_fields = list(new_names.intersection(old_names))\n\n    new_schema = convert_new_schema(new_schema, output_fields)\n\n    old_fields = get_clean_fields(old_fields)\n\n    schema = get_field_schema(milvus_manager.client, new_schema)\n\n    results = get_collection_iterator(milvus_manager, collection_name, output_fields, batch_size)\n\n    try:\n        create_collection(milvus_manager, new_collection, schema)\n\n        insert_into_collection(milvus_manager, new_collection, results)\n\n        cleanup(milvus_manager, collection_name, new_collection)\n    except Exception as e:\n        logger.error(f\"Got an error for collection {collection_name}: {e}\", exc_info=True)\n\ndef main(database, collection_name, new_schema, batch_size):\n    \"\"\"The entrypoint\"\"\"\n    milvus = MilvusManager(database)\n    alter_collection(milvus, collection_name, new_schema, batch_size)\n```\n\nAfter this, I called this function from inside a Django management command.\n\n``` python\nclient.add_collection_field(\n    
collection_name=\"my_collection\",\n    field_name=\"new_field\",\n    datatype=DataType.VARCHAR,\n    max_length=500,\n    nullable=True\n)\n```\n\nThis native call does not support removing old fields or adding defaults.\n\nThanks for reading!", "url": "https://wpnews.pro/news/how-i-built-a-milvus-alter-command-in-django-before-native-support-existed", "canonical_source": "https://dev.to/anuj66283/how-i-built-a-milvus-alter-command-in-django-before-native-support-existed-5g6f", "published_at": "2026-06-29 05:03:26+00:00", "updated_at": "2026-06-29 05:27:04.056924+00:00", "lang": "en", "topics": ["developer-tools", "machine-learning", "ai-infrastructure"], "entities": ["Milvus", "Django", "MilvusClient", "DataType"], "alternates": {"html": "https://wpnews.pro/news/how-i-built-a-milvus-alter-command-in-django-before-native-support-existed", "markdown": "https://wpnews.pro/news/how-i-built-a-milvus-alter-command-in-django-before-native-support-existed.md", "text": "https://wpnews.pro/news/how-i-built-a-milvus-alter-command-in-django-before-native-support-existed.txt", "jsonld": "https://wpnews.pro/news/how-i-built-a-milvus-alter-command-in-django-before-native-support-existed.jsonld"}}