Building LSTMs with PyTorch and Lightning AI Part 1: First Steps with LSTMs



A developer implemented an LSTM from scratch using PyTorch and Lightning AI, detailing the initialization of weights and biases for the forget, input, cell candidate, and output gates. The tutorial uses the Adam optimizer and normal distribution for weight initialization, with all parameters set to trainable.

Published Jun 21, 2026 · 2 min read

In this article, we will explore how to implement an LSTM using PyTorch and Lightning.

For more details about LSTMs, there is a separate series of articles available here.

To begin, we first import the required modules.

import torch
import torch.nn as nn
import torch.nn.functional as F

We also introduce a new optimizer:

from torch.optim import Adam

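Adam is the optimizer we use to fit the neural network to the data.

It works similarly to SGD, but it keeps running averages of each parameter's gradient and squared gradient and uses them to scale every update. In practice, this often lets Adam converge faster than plain SGD.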
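To make the idea of an adaptive step size concrete, here is a minimal sketch of the standard Adam update rule in plain Python. It is not PyTorch's internal implementation; the function name adam_minimize and the toy loss (w - 3)^2 are assumptions chosen only for illustration.

```python
import math

def adam_minimize(grad_fn, w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a one-parameter function with the standard Adam update rule."""
    m = 0.0  # running average of the gradient (first moment)
    v = 0.0  # running average of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for starting m at zero
        v_hat = v / (1 - beta2 ** t)  # bias correction for starting v at zero
        # The step is divided by the recent gradient magnitude, so the
        # effective step size adapts as training proceeds.
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Hypothetical toy loss: (w - 3)^2, whose gradient is 2 * (w - 3).
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0), w=0.0)
print(w_final)  # should end up close to 3
```

In the Lightning module, none of this is written by hand: the Adam class imported above takes the model's parameters and a learning rate and performs these updates for us.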

Next, we continue with the remaining imports:

import lightning as L
from torch.utils.data import TensorDataset, DataLoader
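As a rough illustration of what these two data utilities do, the sketch below wraps a pair of tensors in a TensorDataset and iterates over it with a DataLoader. The values in inputs and labels are made-up toy data, not the data set used in this series.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical toy data: two sequences of four values, each with one label.
inputs = torch.tensor([[0.0, 0.5, 0.25, 1.0],
                       [1.0, 0.5, 0.25, 1.0]])
labels = torch.tensor([0.0, 1.0])

dataset = TensorDataset(inputs, labels)  # pairs inputs[i] with labels[i]
dataloader = DataLoader(dataset)         # defaults: batch_size=1, no shuffling

for batch_inputs, batch_labels in dataloader:
    # Each batch holds one sequence and its label.
    print(batch_inputs.shape, batch_labels.shape)
```

A loader like this is what a Lightning Trainer typically receives when fitting a model, and each batch it yields is what arrives in training_step.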

We define the neural network by creating a Lightning module.

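class LSTMByHand(L.LightningModule):
    def __init__(self):
        # Create and initialize the weights and biases.
        pass

    def lstm_unit(self, input_value, long_memory, short_memory):
        # Do the LSTM math for a single step.
        pass

    def forward(self, input):
        # Make a forward pass through the LSTM.
        pass

    def configure_optimizers(self):
        # Return the optimizer (Adam) used for training.
        pass

    def training_step(self, batch, batch_idx):
        # Compute the loss for one batch.
        pass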

Now let’s implement the __init__ method.

This is where we initialize all weights and biases.

class LSTMByHand(L.LightningModule):
    def __init__(self):
        super().__init__()

        mean = torch.tensor(0.0)  # Mean of the normal distribution
        std = torch.tensor(1.0)   # Standard deviation

        # Forget gate: two weights and one bias
        self.wlr1 = nn.Parameter(torch.normal(mean=mean, std=std), requires_grad=True)
        self.wlr2 = nn.Parameter(torch.normal(mean=mean, std=std), requires_grad=True)
        self.blr1 = nn.Parameter(torch.tensor(0.0), requires_grad=True)

        # Input gate: two weights and one bias
        self.wpr1 = nn.Parameter(torch.normal(mean=mean, std=std), requires_grad=True)
        self.wpr2 = nn.Parameter(torch.normal(mean=mean, std=std), requires_grad=True)
        self.bpr1 = nn.Parameter(torch.tensor(0.0), requires_grad=True)

        # Cell candidate: two weights and one bias
        self.wp1 = nn.Parameter(torch.normal(mean=mean, std=std), requires_grad=True)
        self.wp2 = nn.Parameter(torch.normal(mean=mean, std=std), requires_grad=True)
        self.bp1 = nn.Parameter(torch.tensor(0.0), requires_grad=True)

        # Output gate: two weights and one bias
        self.wo1 = nn.Parameter(torch.normal(mean=mean, std=std), requires_grad=True)
        self.wo2 = nn.Parameter(torch.normal(mean=mean, std=std), requires_grad=True)
        self.bo1 = nn.Parameter(torch.tensor(0.0), requires_grad=True)

Unlike earlier examples, we initialize weights using a normal distribution.

Before moving further, let’s understand what that means.

Imagine measuring the heights of a large group of people. Most heights cluster around the average, and very short or very tall people are rare.

When plotted, this forms a symmetric bell-shaped curve.

This is called a normal distribution.

We use a mean of 0 and a standard deviation of 1.
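To see what drawing from this distribution looks like, here is a small sketch. The seed and the sample count are arbitrary choices for illustration and are not part of the original tutorial.

```python
import torch

torch.manual_seed(0)  # arbitrary seed so the run is repeatable

mean = torch.tensor(0.0)
std = torch.tensor(1.0)

# One draw, as in __init__: the result is a 0-dimensional (scalar) tensor.
single = torch.normal(mean=mean, std=std)
print(single.shape)  # torch.Size([])

# Many draws, to check that the sample statistics match the settings.
samples = torch.normal(mean=0.0, std=1.0, size=(100_000,))
print(samples.mean().item(), samples.std().item())  # close to 0 and 1
```

Each weight in __init__ is one such scalar draw, so every run starts from different random weights unless a seed is fixed.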

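Also, all parameters have requires_grad=True (the default for nn.Parameter, spelled out here for clarity), meaning gradients are computed for them during backpropagation so the optimizer can update them.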
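The effect of wrapping a tensor in nn.Parameter can be checked with a small sketch. TinyModule is a hypothetical stand-in for LSTMByHand with just one weight and one bias; it is not part of the original tutorial.

```python
import torch
import torch.nn as nn

class TinyModule(nn.Module):  # hypothetical stand-in for LSTMByHand
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.tensor(2.0))  # requires_grad defaults to True
        self.b = nn.Parameter(torch.tensor(0.0), requires_grad=True)

    def forward(self, x):
        return self.w * x + self.b

model = TinyModule()

# nn.Parameter attributes are registered automatically with the module.
names = [name for name, _ in model.named_parameters()]
print(names)  # ['w', 'b']

# Backpropagation fills in .grad for every trainable parameter.
loss = (model(torch.tensor(3.0)) - 1.0) ** 2
loss.backward()
print(model.w.grad, model.b.grad)  # tensor(30.) tensor(10.)
```

Because the parameters are registered this way, calling self.parameters() inside a module returns all of them, which is how an optimizer such as Adam finds the values it should update.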

Next, we will explore the lstm_unit function and how the LSTM actually processes information step by step.


Read original on dev.to → dev.to/rijultp/building-lstms-with-pytorch-and-l…
