{"slug": "building-lstms-with-pytorch-and-lightning-ai-part-3-finishing-the-lstm-cell", "title": "Building LSTMs with PyTorch and Lightning AI Part 3: Finishing the LSTM Cell", "summary": "A developer completed building an LSTM cell using PyTorch and Lightning AI, implementing the forward pass and optimizer. The LSTM unit processes stock price inputs over four days, updating long-term and short-term memory. The model uses the Adam optimizer for training.", "body_md": "In the [previous article](https://dev.to/rijultp/building-lstms-with-pytorch-and-lightning-ai-part-2-starting-the-lstm-unit-implementation-175o), we started building the LSTM cell.\n\nIn this article, we will finish the LSTM unit, then write the forward pass and configure the optimizer.\n\nIn the final stage of the LSTM unit, we create the updated short-term memory and determine what percentage of it should be sent to the output.\n\nFirst, we calculate the output percentage:\n\n``` python\noutput_percent = torch.sigmoid(\n    (short_memory * self.wo1) +\n    (input_value * self.wo2) +\n    self.bo1\n)\n```\n\nHere:\n\n- `wo1` is the weight applied to the current short-term memory.\n- `wo2` is the weight applied to the current input value.\n- `bo1` is the bias term.\n\nThe sigmoid function produces a value between 0 and 1, representing the fraction of information that should be passed to the output.\n\nNext, we use this percentage to scale the new short-term memory.\n\nWe first apply the tanh activation function to the updated long-term memory, which squashes it into the range -1 to 1, and then multiply the result by `output_percent`.\n\n``` python\nupdated_short_memory = torch.tanh(updated_long_memory) * output_percent\n```\n\nFinally, we return the updated long-term and short-term memory values:\n\n``` python\nreturn [updated_long_memory, updated_short_memory]\n```\n\nAt this point, our `lstm_unit()` function is complete.\n\n``` python\ndef lstm_unit(self, input_value, long_memory, short_memory):\n\n    long_remember_percent = torch.sigmoid(\n        
(short_memory * self.wlr1) +\n        (input_value * self.wlr2) +\n        self.blr1\n    )\n\n    potential_remember_percent = torch.sigmoid(\n        (short_memory * self.wpr1) +\n        (input_value * self.wpr2) +\n        self.bpr1\n    )\n\n    potential_memory = torch.tanh(\n        (short_memory * self.wp1) +\n        (input_value * self.wp2) +\n        self.bp1\n    )\n\n    updated_long_memory = (\n        (long_memory * long_remember_percent) +\n        (potential_remember_percent * potential_memory)\n    )\n\n    output_percent = torch.sigmoid(\n        (short_memory * self.wo1) +\n        (input_value * self.wo2) +\n        self.bo1\n    )\n\n    updated_short_memory = (\n        torch.tanh(updated_long_memory) * output_percent\n    )\n\n    return [updated_long_memory, updated_short_memory]\n```\n\nNow that we have implemented the LSTM unit, the next step is to create the `forward()` method that performs a forward pass through the unrolled LSTM.\n\nFor this example, the input will be the stock prices from the previous four days.\n\nFirst, we initialize the long-term and short-term memory values:\n\n``` python\ndef forward(self, input):\n    # Both memories start at zero before the first day is processed.\n    long_memory = 0\n    short_memory = 0\n```\n\nNext, we process each day's stock price through the LSTM unit:\n\n``` python\ndef forward(self, input):\n\n    # Both memories start at zero before the first day is processed.\n    long_memory = 0\n    short_memory = 0\n\n    # Unpack the four daily stock prices.\n    day1 = input[0]\n    day2 = input[1]\n    day3 = input[2]\n    day4 = input[3]\n\n    # Feed each day through the unit, carrying both memories forward.\n    long_memory, short_memory = self.lstm_unit(\n        day1, long_memory, short_memory\n    )\n\n    long_memory, short_memory = self.lstm_unit(\n        day2, long_memory, short_memory\n    )\n\n    long_memory, short_memory = self.lstm_unit(\n        day3, long_memory, short_memory\n    )\n\n    long_memory, short_memory = self.lstm_unit(\n        day4, long_memory, short_memory\n    )\n\n    # The final short-term memory is the output of the LSTM.\n    return short_memory\n```\n\nHere, the same LSTM unit, with the same weights and biases, is reused for each day's input. 
As each value is processed, the long-term and short-term memory are updated and carried forward to the next step.\n\nAfter the fourth day, we return the final short-term memory, which serves as the output of the LSTM.\n\nNow that we have a `forward()` method capable of performing a forward pass through the unrolled LSTM, we are ready to configure the optimizer.\n\nThis is straightforward:\n\n``` python\n# Assumes Adam was imported earlier in the series: from torch.optim import Adam\ndef configure_optimizers(self):\n    return Adam(self.parameters())\n```\n\nThis tells Lightning to use the Adam optimizer to train all trainable parameters in the model. Because no learning rate is passed, Adam uses its default of 0.001.\n\nIn the next article, we will explore the `training_step()` method, which is responsible for calculating the loss during training.", "url": "https://wpnews.pro/news/building-lstms-with-pytorch-and-lightning-ai-part-3-finishing-the-lstm-cell", "canonical_source": "https://dev.to/rijultp/building-lstms-with-pytorch-and-lightning-ai-part-3-finishing-the-lstm-cell-1fo0", "published_at": "2026-06-24 19:05:42+00:00", "updated_at": "2026-06-24 19:09:04.198705+00:00", "lang": "en", "topics": ["machine-learning", "neural-networks", "developer-tools"], "entities": ["PyTorch", "Lightning AI", "Adam"], "alternates": {"html": "https://wpnews.pro/news/building-lstms-with-pytorch-and-lightning-ai-part-3-finishing-the-lstm-cell", "markdown": "https://wpnews.pro/news/building-lstms-with-pytorch-and-lightning-ai-part-3-finishing-the-lstm-cell.md", "text": "https://wpnews.pro/news/building-lstms-with-pytorch-and-lightning-ai-part-3-finishing-the-lstm-cell.txt", "jsonld": "https://wpnews.pro/news/building-lstms-with-pytorch-and-lightning-ai-part-3-finishing-the-lstm-cell.jsonld"}}