Adding Window Previews to Dash to Dock: A GNOME Shell Extension Deep Dive

I ported, debugged, and shipped a window preview feature for Dash to Dock — one of the most popular GNOME Shell extensions. This post documents the journey: what worked, what broke repeatedly, and the hard-won lessons about GNOME Shell extension development on Wayland.

The Starting Point: A Stale PR

It began with PR #574 on the upstream repository — an old pull request that added window preview popups when hovering over dock icons. The code was outdated and no longer compatible with current GNOME Shell. I pulled it down, analysed it, and set about porting it to the modern GNOME Shell API.

The goal was straightforward: when a user hovers over an application icon in the dock, show live thumbnail previews of that application’s open windows — similar to the taskbar preview feature in Windows.

The Focus Nightmare

The single hardest problem was focus management. GNOME Shell’s popup menu system uses a modal grab model — when a popup opens, it captures input events. This is fine for right-click context menus, but for hover-triggered preview windows it creates a nightmare:

  • The preview popup would steal focus — once shown, moving the mouse to another dock icon did nothing. The user was trapped.
  • The only escape was clicking elsewhere — hover-out events were swallowed by the modal grab.
  • Fixing focus broke the context menu — every attempt to make the preview release focus would break the right-click context menu. This happened at least 5 times during development.

The solution required carefully managing the popup’s reactive state: the preview needed to be open for rendering purposes but not capture input in the way a traditional menu does. We had to iterate through all dock icons on enter/leave events, explicitly closing previews for other applications.
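That enter-handler pattern can be sketched as follows (illustrative Python rather than the extension's actual GJS; the DockIcon class and method names here are hypothetical stand-ins):

```python
class DockIcon:
    """Hypothetical stand-in for a dock icon that owns a hover preview."""

    def __init__(self, app_name):
        self.app_name = app_name
        self.preview_open = False

    def open_preview(self):
        self.preview_open = True

    def close_preview(self):
        self.preview_open = False


def on_icon_enter(hovered, all_icons):
    # Explicitly close every other icon's preview first, then open ours.
    # This avoids relying on the modal grab to deliver hover-out events.
    for icon in all_icons:
        if icon is not hovered and icon.preview_open:
            icon.close_preview()
    hovered.open_preview()
```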

Debugging on Wayland

GNOME Shell extensions on Wayland present unique debugging challenges. You cannot simply restart the shell — you need a nested Wayland session:

dbus-run-session -- gnome-shell --nested --wayland

Copy-paste does not work between the nested session and the host, so I added file-based logging early on to /tmp/dash-to-dock-DATE.log. This turned out to be essential, creating a tight feedback loop: run the extension, reproduce the bug, read the logs, apply a fix, rebuild, repeat.
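The extension itself is GJS, but the logging helper amounts to this pattern (sketched here in Python for illustration; the path mirrors the /tmp/dash-to-dock-DATE.log convention above):

```python
import datetime

LOG_PATH = "/tmp/dash-to-dock-{date}.log".format(
    date=datetime.date.today().isoformat()
)


def log(message, path=LOG_PATH):
    # Append a timestamped line; open and close per call so the file is
    # always flushed, even if the shell (or nested session) crashes.
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open(path, "a") as f:
        f.write(f"[{stamp}] {message}\n")
```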

The Open/Close/Open Flicker Bug

One persistent bug: on the first hover over any icon, the preview would appear, disappear, then appear again — a visible flicker. It only happened once per icon, then worked fine afterward.

The root cause was a race condition between the hover-open timeout (300ms delay before showing) and the menu system’s own open/close signals. When the popup first rendered, GNOME Shell’s menu manager would fire a menu-closed signal during initial layout, which triggered our close handler, which then got re-triggered by the still-active hover state.

The fix was a justOpened guard flag that suppressed close events for a brief window after the initial open.
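In outline, the guard looks like this (a Python sketch; the real extension uses GJS timeouts and signal handlers, and the 300 ms window here is illustrative):

```python
import time


class PreviewPopup:
    GUARD_WINDOW = 0.3  # seconds; suppress close events briefly after opening

    def __init__(self):
        self.visible = False
        self._opened_at = 0.0

    def open(self):
        self.visible = True
        self._opened_at = time.monotonic()

    def on_menu_closed(self):
        # Ignore the spurious close signal fired during initial layout.
        if time.monotonic() - self._opened_at < self.GUARD_WINDOW:
            return
        self.visible = False
```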

Window Preview Sizing

Getting the preview thumbnails to size correctly went through several iterations:

  1. First attempt: Fixed-size preview boxes — resulted in disproportionate previews with large empty spaces for narrow windows.
  2. Second attempt: Scale app thumbnails to fit the preview box — made everything look like squashed squares.
  3. Final approach: Dynamic width based on the application window’s aspect ratio, with a fixed height and a maximum width cap at 90% of desktop horizontal space. Single windows get a tight fit; multiple windows of the same app expand horizontally.
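The final sizing rule reduces to a few lines of arithmetic (a sketch; the fixed height and 90% cap follow the description above, but the exact constants in the extension may differ):

```python
def preview_width(window_w, window_h, fixed_height, screen_w, cap=0.9):
    # Width follows the window's aspect ratio at a fixed preview height,
    # clamped so no single preview exceeds 90% of the screen width.
    aspect = window_w / window_h
    return min(aspect * fixed_height, cap * screen_w)


def row_width(windows, fixed_height, screen_w, cap=0.9):
    # Multiple windows of the same app expand horizontally, but the
    # whole row is still capped at 90% of the screen width.
    total = sum(preview_width(w, h, fixed_height, screen_w, cap)
                for w, h in windows)
    return min(total, cap * screen_w)
```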

Aero Peek: Fading Away the Clutter

With previews working, I added an Aero Peek feature inspired by Windows: when hovering over a specific window thumbnail in the preview popup, all other windows fade to near-transparency (1% opacity), letting the user see the target window beneath the stack.

I initially tried full transparency (0% opacity) with a glow outline around windows, but GNOME Shell’s compositor does not support per-window glow effects without custom shaders. The simple opacity fade at 1% proved effective enough — windows are essentially invisible but technically still rendered, avoiding compositor edge cases.

Animation Styles That Refused to Animate

I added a settings dropdown for preview animation styles: Instant, Fade, Slide, Scale, Expand, and Dissolve. The settings UI worked, the values were stored and read correctly, the animation config function returned the right style… but nothing visually changed.

Over a dozen debug sessions later, the issue was that the animation transitions were being applied after the popup was already visible. The Clutter animation framework needs the initial state set before the actor is shown, then the target state set to trigger the transition. We were setting both states after show(), so there was nothing to animate between.

The Translation Gap

The feature introduced new user-facing strings — Show window previews on mouse hover, animation style names, etc. These all needed translations across the extension’s supported languages. GNOME Shell extensions use gettext, and each string needed entries in every .po file.

Key Takeaways for GNOME Shell Extension Development

  1. File-based logging is non-negotiable on Wayland — console.log goes to the journal, but you cannot easily copy from a nested session. Write to /tmp/.
  2. The menu/popup system is a minefield — GNOME Shell’s PopupMenu assumes modal interaction. If you need non-modal popups (like hover previews), expect to fight the framework.
  3. Test the right-click menu after every change — focus and event handling changes have a remarkable ability to break context menus silently.
  4. Race conditions are the default — hover events, timeouts, animation completions, and menu signals all fire asynchronously. Guard everything.
  5. Aspect ratio math is harder than it sounds — single window vs. multi-window layout, mixed landscape/portrait windows, and screen size limits all interact.
  6. Use a nested Wayland session — dbus-run-session -- gnome-shell --nested --wayland saves you from logging out after every crash.

The Result

The final feature, submitted as PR #2470, adds:

  • Hover-triggered window preview popups with live thumbnails
  • Dynamic sizing based on window aspect ratios
  • Clickable previews that bring windows to focus
  • Close buttons on individual previews
  • Aero Peek transparency effect
  • Multiple animation styles (fade, slide, scale, expand, dissolve)
  • Late-arriving window detection (preview appears when an app finishes launching)
  • Full translation support

Dozens of iterations, countless regressions, and one very stubborn focus-stealing bug later — it works.


S0ix Sleep and Hibernate on Meteor Lake, NVIDIA and the ThinkPad P1 Gen 7

Getting modern sleep (S0ix / “Modern Standby”) working on a Lenovo ThinkPad P1 Gen 7 (21KV) with Intel Meteor Lake and an NVIDIA RTX 4060 has been a significant undertaking. This post documents the hardware, the problems encountered, every fix applied, and the remaining blockers — in the hope it saves someone else the same multi-day debugging session.

Hardware

  • Laptop: Lenovo ThinkPad P1 Gen 7 (21KV0025UK)
  • CPU: Intel Core Ultra 7 165H (Meteor Lake)
  • GPU: NVIDIA RTX 4060 Max-Q Mobile (AD107M) — PRIME offload only, the Intel iGPU drives the display
  • OS: Fedora 43, kernel 6.19.x

The Problem: S0ix Does Not Work (Out of the Box)

Meteor Lake dropped legacy S3 sleep entirely. The only suspend option is S0ix (also called s2idle / Modern Standby), which requires every single device and subsystem to reach a low-power state. If any component stays awake, the CPU package cannot enter its deepest idle state and the laptop drains battery as if it were still running.

On this machine, over 35 PMC substate requirements were unmet out of the box. The root cause traces to the IOE die (PMC1) failing to power-gate, which appears to be a CSME firmware issue. The behaviour is even boot-dependent: some boots partially work while others do not.
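Whether a given suspend cycle actually reached S0ix can be checked by comparing the PMC's residency counter before and after suspend. A sketch (assumes the pmc_core debugfs interface is available, which it is on Meteor Lake kernels, and requires root):

```python
def read_s0ix_residency(path="/sys/kernel/debug/pmc_core/slp_s0_residency_usec"):
    # Microseconds the SoC has spent in S0ix since boot.
    with open(path) as f:
        return int(f.read().strip())


def s0ix_entered(before_usec, after_usec):
    # If the counter did not advance across a suspend cycle,
    # some device or substate blocked S0ix entry.
    return after_usec > before_usec
```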

Kernel Parameters

The following kernel parameters were added to /etc/default/grub on the GRUB_CMDLINE_LINUX line:

pcie_aspm=force acpi.ec_no_wakeup=1 nmi_watchdog=0 snd_hda_intel.power_save=1

  • pcie_aspm=force — Force Active State Power Management on PCIe devices that don’t advertise support. Required because several Meteor Lake root ports won’t enter L1 otherwise.
  • acpi.ec_no_wakeup=1 — Prevent the Embedded Controller from generating spurious wakeups. ThinkPads are notorious for this.
  • nmi_watchdog=0 — Disable the NMI watchdog. It prevents the CPU from entering deep C-states.
  • snd_hda_intel.power_save=1 — Enable power saving on the HDA audio codec (1 second timeout).
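After regenerating the GRUB config and rebooting, it is worth confirming the parameters actually took effect. A small check, written as a pure function so it can be tested off-box:

```python
def missing_params(cmdline, required):
    # /proc/cmdline is a single space-separated line of parameters.
    present = set(cmdline.split())
    return [p for p in required if p not in present]


REQUIRED = [
    "pcie_aspm=force",
    "acpi.ec_no_wakeup=1",
    "nmi_watchdog=0",
    "snd_hda_intel.power_save=1",
]

# On the target machine:
#   missing_params(open("/proc/cmdline").read(), REQUIRED)
```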

ACPI Wake Sources

Several ACPI wake sources were causing immediate or spurious wake from suspend. A systemd service was created to disable them at boot:

# /etc/systemd/system/disable-wakeup-sources.service
# Disables: RP12, TXHC, TDM0, TRP0, TRP1

[Unit]
Description=Disable problematic ACPI wake sources
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'for src in RP12 TXHC TDM0 TRP0 TRP1; do echo "$src" > /proc/acpi/wakeup || true; done'

[Install]
WantedBy=multi-user.target

These sources relate to PCIe root ports and Thunderbolt controllers that would wake the machine immediately after suspend or on any USB-C cable event.
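The kernel's wakeup table, /proc/acpi/wakeup, lists one device per line with an enabled/disabled state; writing a device name back toggles it. A parsing sketch for spotting which of the problem sources are still armed (the sample format below is typical, but column layout can vary slightly between kernels):

```python
PROBLEM_SOURCES = {"RP12", "TXHC", "TDM0", "TRP0", "TRP1"}


def enabled_problem_sources(wakeup_text):
    # Lines look like: "RP12\t  S4\t*enabled   pci:0000:00:1d.0"
    hits = []
    for line in wakeup_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3 and fields[0] in PROBLEM_SOURCES:
            if "enabled" in fields[2]:
                hits.append(fields[0])
    return hits
```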

PMC LTR (Latency Tolerance Reporting) Overrides

Several devices were reporting LTR values that prevented the SoC from entering S0ix. These were overridden by writing to the PMC LTR ignore registers:

# LTR ignore indices:
# 1  = SOUTHPORT_B
# 3  = GBE (Gigabit Ethernet)
# 6  = ME (Management Engine)
# 8  = SOUTHPORT_C
# 25 = IOE_PMC

The GBE and ME entries are particularly important — the Intel Management Engine’s LTR blocks S0ix on virtually every Meteor Lake system unless ignored.
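The kernel exposes this through pmc_core's debugfs: writing an index to ltr_ignore masks that IP block's LTR value. A hedged sketch (the indices are the ones listed above for this platform; they differ between SoC generations, and the write requires root):

```python
LTR_IGNORE_INDICES = {
    "SOUTHPORT_B": 1,
    "GBE": 3,
    "ME": 6,
    "SOUTHPORT_C": 8,
    "IOE_PMC": 25,
}


def ignore_ltr(name, path="/sys/kernel/debug/pmc_core/ltr_ignore"):
    # The interface takes the decimal index of the IP block to mask.
    with open(path, "w") as f:
        f.write(str(LTR_IGNORE_INDICES[name]))
```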

USB Subsystem

Two USB devices were holding the XHCI controller in D0, blocking package C-state entry:

  • The fingerprint reader
  • The Bluetooth adapter

Both were resolved by enabling USB autosuspend for all devices. Runtime PM was also explicitly enabled on the Thunderbolt host controller.
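Enabling autosuspend across the board amounts to writing "auto" into each device's power/control node. A sketch with the sysfs root parameterised so it can be exercised against a fake tree (on a real system it needs root):

```python
import glob
import os


def enable_usb_autosuspend(sysfs_root="/sys/bus/usb/devices"):
    # Flip every USB device's runtime-PM policy from "on" to "auto".
    changed = []
    for ctl in glob.glob(os.path.join(sysfs_root, "*", "power", "control")):
        try:
            with open(ctl, "w") as f:
                f.write("auto")
            changed.append(ctl)
        except OSError:
            pass  # device went away or permission denied; skip it
    return changed
```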

BIOS: Disable Intel AMT

Intel Active Management Technology (AMT) was enabled by default in the BIOS. This caused three PMC blockers:

  • KVMCC — KVM Controller for remote management
  • USBR0 — USB Redirection
  • SUSRAM — Suspend RAM used by AMT

Disabling AMT in the BIOS (Config → Network → Intel AMT) immediately cleared all three.

NVIDIA GPU Power Management

The NVIDIA GPU required extensive configuration to stop it from blocking low-power states:

Services Disabled

  • nvidia-powerd — Was causing the GPU to cycle between D3cold and D0 repeatedly
  • nvidia-persistenced — Kept a handle on the GPU, preventing it from entering D3cold

D3cold Enabled

Runtime power management and D3cold (full power off) were enabled for the NVIDIA GPU via /etc/modprobe.d/nvidia-pm.conf.

Proprietary Driver with GSP Firmware Disabled

The open-source nvidia-open kernel module was replaced with the proprietary NVIDIA driver, and GSP (GPU System Processor) firmware was disabled:

options nvidia NVreg_EnableGpuFirmware=0

This was necessary because the GSP firmware interfered with proper suspend/resume cycling on this GPU generation.

CUDA Isolation for Background Services

Any background service that might poke the GPU (e.g., monitoring tools, bots) was configured with:

Environment=CUDA_VISIBLE_DEVICES=

This prevents them from initialising the CUDA runtime and keeping the GPU in D0.

Other Fixes

  • intel-lpmd — Intel’s Low Power Mode Daemon was installed to help manage CPU power states on Meteor Lake.
  • SELinux — A custom /data partition had incorrect SELinux contexts, which was blocking systemd-logind from writing hibernate state. Relabelling the partition fixed it.

Hibernate: Broken

Hibernate (suspend-to-disk) does not work on this hardware combination and is unlikely to work any time soon.

The kernel panics during the hibernate image write phase. The NVIDIA driver’s pci_pm_freeze callback either fails or causes a kernel panic. This was tested with:

  • The open-source nvidia-open driver
  • The proprietary NVIDIA driver
  • GSP firmware on and off
  • Writing to /sys/power/state (sysfs) directly
  • Skipping VT switch (no chvt)

All combinations result in a panic. This is a known upstream bug tracked at NVIDIA/open-gpu-kernel-modules Issue #922.

Do not attempt hibernate on this platform with NVIDIA drivers loaded.

What Still Blocks S0ix

Even after all the above fixes, S0ix is not fully achieved. The remaining blockers are in the PMC and relate to PLLs and fabric that cannot be controlled from userspace:

PMC1 (IOE Die)

  • SBR0–4 (Sideband Router instances)
  • FABRIC_PLL, TMU_PLL, BCLK_PLL, D2D_PLL, G5FPW PLLs, REF_PLL
  • VNN_SOC_REQ_STS

PMC0 (SoC Die)

  • LPSS — I2C4 stuck in D0 (likely the touchpad or touchscreen controller)
  • FABRIC_PLL, IOE_COND_MET
  • ITSS_CLK_SRC_REQ_STS
  • LSX_Wake4–7

These are firmware-level blockers. The IOE die (PMC1) issue in particular appears to require a CSME firmware update from Lenovo.

Things to Avoid

  • Do not restart systemd-logind — It will kill your entire GNOME session.
  • Do not use sleep hooks that run modprobe -r nvidia — gnome-shell holds nvidia_drm open; the unload will hang.
  • Do not use rtcwake -m mem — It bypasses the NVIDIA systemd suspend services and will likely result in a broken resume.
  • Do not attempt hibernate — Kernel panic (see above).

Configuration Files Modified

For reference, the following files were created or modified during this process:

  • /etc/default/grub — Kernel parameters
  • /etc/modprobe.d/nvidia-pm.conf — NVIDIA power management options
  • /etc/modprobe.d/nvidia-installer-disable-nouveau.conf — Nouveau blacklist
  • /etc/systemd/logind.conf.d/lid-suspend-then-hibernate.conf — Lid action configuration
  • /etc/systemd/sleep.conf — Sleep/hibernate configuration
  • /etc/systemd/system/disable-wakeup-sources.service — ACPI wake source disabling
  • /usr/lib/systemd/system-sleep/ollama-nvidia-hibernate.sh — Pre/post sleep hooks

Conclusion

Modern Standby on Meteor Lake with NVIDIA discrete graphics is a minefield. While many of the userspace-controllable blockers can be resolved, the core IOE die power-gating issue and the NVIDIA hibernate panic are firmware bugs that no amount of kernel tuning can fix. If you’re buying a Meteor Lake workstation for Linux, check the NVIDIA hibernate bug status and your vendor’s CSME firmware changelog before committing.



Successfully Redeploying the Feeditout Service with Ansible

After a long journey of iteration, troubleshooting, and learning, I’m excited to share that I’ve successfully redeployed the Feeditout service using Ansible.

This wasn’t just a redeployment — it was a full re-architecture of how the system is provisioned, secured, monitored, and maintained. I went deep into infrastructure-as-code territory and came out the other side with a more robust, modular, and maintainable setup than ever before.

Lessons from My Ansible Journey

At the heart of this process was Ansible — and it’s fair to say I’ve come a long way in mastering it. What began as a handful of playbooks quickly evolved into a library of roles, reusable tasks, and templated configuration files.

I focused heavily on idempotency, readability, and separation of concerns. Along the way, I developed a strong preference for minimal inline logic and clean, descriptive variable names. I also became comfortable enforcing good practices like avoiding the generic item as a loop variable and steering clear of block statements unless genuinely needed.

Roles I Wrote

Here’s a snapshot of the roles I built and used during this process — each one crafted with purpose:

  • aide
  • alert_manager
  • ansible_pull
  • apache2
  • apparmor
  • apt
  • auditd
  • base_packages
  • certbot
  • chkrootkit
  • chuckbot
  • clamav
  • clean
  • cockpit
  • cron
  • dns
  • entropy
  • fail2ban
  • fail2counter
  • grafana
  • grub
  • hostname
  • iptables
  • kernel
  • keyboard
  • locale
  • logrotate
  • logwatch
  • memcached
  • motd
  • mysql
  • network_manager
  • node_exporter
  • ntp
  • opendkim
  • opendmarc
  • pam
  • passwd
  • php_fpm
  • postfix
  • postsrsd
  • prometheus
  • rclone
  • redis
  • root_password
  • rsyslog
  • saslauthd
  • services
  • spamassassin
  • sshd
  • sudo
  • swap
  • wayland

From security hardening (auditd, chkrootkit, aide, fail2ban) to service monitoring (grafana, prometheus, alert_manager), mail stack configuration (postfix, opendkim, opendmarc, postsrsd, saslauthd), and even custom integrations like chuckbot, every role played a part.

Each role encapsulates everything needed to configure a specific service — packages, configuration files, systemd services, and sensible defaults — while remaining fully overrideable via host_vars.

The Payoff

Feeditout is now:

  • Secure by default with automated auditing, logging, and spam controls.
  • Monitored with a complete Prometheus + Grafana setup and alert routing.
  • Configured from scratch using a fully automated Ansible repo.
  • Easier to maintain, extend, and recover from disaster.

Most importantly, I now have confidence in my infrastructure, because it’s reproducible and self-documented through code.

What’s Next?

Now that the foundation is solid, I’ll be iterating on:

  • Self-healing features (auto-restart, watchdogs),
  • Zero-downtime deployments,
  • Better observability dashboards,
  • Maybe even a public Git repo or guide for others to use and learn from.

If you’re thinking about doing something similar — take the plunge. It’s a challenge, but you’ll learn more about your systems and tools than you ever could from reading docs alone.


Exciting Updates in Jira Creator – Version 1.0.3 Released!

Hey there, fellow developers! I’m thrilled to share the latest updates from our project, Jira Creator, with the release of version 1.0.3. This update brings a variety of changes, from minor tweaks to significant improvements, all aimed at enhancing your experience with our tool.

What’s Changed?

In this release, we made several changes across various files, focusing on improving the overall structure and functionality of the tool. Here’s a quick rundown of what’s new:

  • Updated the environment variable names for clarity and consistency.
  • Refactored the code to improve readability and maintainability.
  • Fixed bugs related to AI provider integration and issue management.
  • Bumped the version number from 0.0.43 to 1.0.3, marking a significant milestone.

How Does This Change the Project?

This update is more than just a version bump. By renaming environment variables to include the prefix “JIRA_”, we’ve made it easier for users to understand what each variable relates to. This small change can help reduce confusion and improve the setup process.

Additionally, the refactoring efforts have led to cleaner code, making it easier for us to maintain and for contributors to understand. This means fewer bugs and a smoother experience for everyone!

Bug Fixes, Refactoring, and Feature Enhancements

We’ve tackled some pesky bugs in this release. For instance, the AI provider integration has been streamlined, ensuring that your AI-related commands work more reliably. The refactoring efforts also mean that the code is now more modular, which is a win for future development.

While there are no new features in this release, the improvements we’ve made are crucial for the stability and usability of the tool. We believe that these changes will significantly enhance your workflow when creating Jira issues.

What About Dependencies or Configurations?

No major changes were made to dependencies or configurations in this release. We’ve kept everything stable, ensuring that you can upgrade without worrying about breaking changes.

We hope you enjoy the improvements in this release! As always, your feedback is invaluable to us, so feel free to reach out with any thoughts or suggestions. Happy coding!



Generating Python Docstrings Automatically

Welcome to my blog! Today, I’m excited to share a neat Python script that automates the generation of docstrings for your Python files. This tool is especially useful for developers who want to maintain clean and well-documented code without spending too much time writing documentation manually.

The script utilizes OpenAI’s API to generate meaningful docstrings based on the code provided. It analyzes the structure of your Python classes and functions, and then creates concise and informative docstrings that follow standard conventions. Let’s dive into the code!


#!/usr/bin/python
import argparse
import hashlib
import json
import os
import re
import statistics
import subprocess
import sys
import tempfile
import time

import requests

system_prompt_general = """
You will be provided with a Python code.
Based on this, your task is to generate a Python docstring for it.
This is the docstring that goes at the top of the file.
The file may include multiple classes or other code.

Ensure the docstring follows Python's standard docstring conventions and provides
just enough detail to make the file understandable and usable without overwhelming the reader.

Please only return the docstring, enclosed in triple quotes, without any other explanation
or additional text. The format should be:

\"\"\"
<docstring content here>
\"\"\"

Make sure to follow the format precisely and provide only the docstring content.
"""

system_prompt_class = """
You will be provided with a Python class, including its code.
Based on this, your task is to generate a Python docstring for it.
Use the class signature and body to infer the purpose of the class,
the attributes it has, and any methods it includes.

Follow these guidelines to create the docstring:

1. Summary: Provide a concise summary of the class's purpose.
    Focus on what the class does and its main goal.
2. Attributes: List the attributes, their types, and a brief description
    of what each one represents.

Ensure the docstring follows Python's standard docstring conventions and provides
just enough detail to make the class understandable and usable without overwhelming the reader.

Please only return the docstring, enclosed in triple quotes, without any other
explanation or additional text. The format should be:

\"\"\"
<docstring content here>
\"\"\"

Make sure to follow the format precisely and provide only the docstring content.
"""

system_prompt_def = """
You will be provided with a Python function, including its code.
Based on this, your task is to generate a Python docstring for it.
Use the function signature and body to infer the purpose of the function,
the arguments it takes, the return value, and any exceptions it may raise.

Follow these guidelines to create the docstring:

1. Summary: Provide a concise summary of the function's purpose.
    Focus on what the function does and its main goal.
2. Arguments: List the parameters, their types, and a brief description
    of what each one represents.
3. Return: If the function has a return value, describe the return type
    and what it represents. If there's no return, OMIT THE SECTION.
4. Exceptions: If the function raises any exceptions, list them with descriptions.
    If no exceptions are raised, OMIT THE SECTION.
5. Side Effects (if applicable): If the function has side effects
    (e.g., modifies global state, interacts with external services), mention them.
    OMIT THE SECTION if it is not clear in the code.
6. Algorithm or Key Logic (optional): If the function is complex,
    provide a high-level outline of the logic or algorithm involved.
    OMIT THE SECTION if it is not clear in the code.

Ensure the docstring follows Python's standard docstring conventions and provides
just enough detail to make the function understandable and usable without overwhelming the reader.

Please only return the docstring, enclosed in triple quotes, without any other
explanation or additional text. The format should be:

\"\"\"
<docstring content here>
\"\"\"

Make sure to follow the format precisely and provide only the docstring content.
"""


class OpenAICost:
    # Static member to track the total cost
    cost = 0
    costs = []  # A list to store the individual request costs

    @staticmethod
    def send_cost(tokens, model):
        # The cost calculation can vary based on the model and tokens
        model_costs = {
            "gpt-3.5-turbo": 0.002,  # Example cost per 1k tokens
            "gpt-4o-mini": 0.003,  # Example cost per 1k tokens
            "gpt-4o": 0.005,  # Example cost per 1k tokens
        }

        cost_per_token = model_costs.get(model, 0)
        cost = (tokens / 1000) * cost_per_token  # Cost is proportional to tokens

        OpenAICost.cost += cost
        OpenAICost.costs.append(cost)

    @staticmethod
    def print_cost_metrics():
        print(f"\nTotal Cost: ${OpenAICost.cost:.4f}")
        if OpenAICost.costs:
            print(
                f"    Average Cost per Request: ${statistics.mean(OpenAICost.costs):.4f}"
            )
            print(f"    Max Cost for a Request: ${max(OpenAICost.costs):.4f}")
            print(f"    Min Cost for a Request: ${min(OpenAICost.costs):.4f}")
            if len(OpenAICost.costs) > 1:
                print(
                    f"    Standard Deviation of Cost: ${statistics.stdev(OpenAICost.costs):.4f}"
                )
        else:
            print("    No costs recorded.")


class OpenAIProvider:
    def __init__(self):
        self.api_key = os.getenv("JIRA_AI_API_KEY")
        if not self.api_key:
            raise EnvironmentError("JIRA_AI_API_KEY not set in environment.")
        self.endpoint = "https://api.openai.com/v1/chat/completions"
        self.model = os.getenv("OPENJIRA_AI_MODEL", "gpt-4")

    def estimate_tokens(self, text: str) -> int:
        tokens = len(text) // 3  # Rough estimate: ~3 characters per token
        return tokens

    def select_model(self, text):
        tokens = self.estimate_tokens(text)

        if tokens < 1000:  # For small files (under ~1000 tokens)
            model = "gpt-3.5-turbo"
        elif tokens < 10000:  # For medium files (under ~10000 tokens)
            model = "gpt-4o-mini"
        else:  # For large files (over ~10000 tokens)
            model = "gpt-4o"

        OpenAICost.send_cost(tokens, model)

        return model

    def improve_text(self, prompt: str, text: str) -> str:
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
        }

        body = {
            "model": self.select_model(text),
            "messages": [
                {"role": "system", "content": prompt},
                {"role": "user", "content": text},
            ],
            "temperature": 0.5,
        }

        response = requests.post(self.endpoint, json=body, headers=headers, timeout=300)
        if response.status_code == 200:
            res = response.json()["choices"][0]["message"]["content"].strip()
            result = res
            # Count the occurrences of '"""'
            occurrences = result.count('"""')
            occurrences_backtick = result.count("```")

            # Check if there are more than two occurrences
            if occurrences > 2:
                # Find the positions of the first and second occurrences
                first_pos = result.find('"""')
                second_pos = result.find('"""', first_pos + 1)

                # Get everything from the first '"""' to the second '"""', inclusive
                result = result[first_pos : second_pos + 3]  # Include the second '"""'

            if occurrences_backtick > 2:
                # Find the positions of the first and second occurrences
                first_pos = result.find("```")
                second_pos = result.find("```", first_pos + 1)

                # Keep everything from the first "```" to the second "```", inclusive
                result = result[first_pos : second_pos + 3]  # Include the closing "```"

            return result
        raise Exception(
            f"OpenAI API call failed: {response.status_code} - {response.text}"
        )


class Docstring:
    def __init__(self, file_path, debug=False, exit=False):
        self.file_path = file_path
        self.lines = []
        self.ai = OpenAIProvider()
        self.line_index = 0
        self.multiline_index = 0
        self.cache_file = "docstring.cache"
        self.debug = debug
        self.exit = exit

        with open(self.file_path, "r") as file:
            self.lines = file.readlines()

        # Ensure cache file exists, create if necessary
        if not os.path.exists(self.cache_file):
            with open(self.cache_file, "w") as cache:
                json.dump({}, cache)

        self._load_cache()

        print(" -> " + self.file_path)

    def print_debug(self, title, out):
        if not self.debug:
            return

        print("=====================================================")
        print("=====================================================")
        print(" > > " + title)
        print("=====================================================")
        out = "".join(out) if isinstance(out, list) else out
        print(out)
        print("=====================================================")
        print("=====================================================")

    def _load_cache(self):
        with open(self.cache_file, "r") as cache:
            self.cache = json.load(cache)

    def _save_cache(self):
        with open(self.cache_file, "w") as cache_file:
            json.dump(self.cache, cache_file, indent=4)

    def _generate_sha1(self, user_prompt):
        return hashlib.sha1(user_prompt.encode("utf-8")).hexdigest()

    def _get_current_timestamp(self):
        return int(time.time())

    def get_ai_docstring(self, sys_prompt, user_prompt, signature):
        sha1_hash = self._generate_sha1(user_prompt)

        # Check if file is in cache
        if self.file_path in self.cache:
            # Check if the user prompt's SHA1 is in the self.cache for this file
            for entry in self.cache[self.file_path]:
                if entry["sha1"] == sha1_hash:
                    # Update last_accessed timestamp
                    entry["last_accessed"] = self._get_current_timestamp()
                    # Return cached docstring if found
                    return entry["docstring"]

        print("    Requesting AI for: " + signature)
        # If no self.cache hit, call the AI and get the docstring
        res = self.ai.improve_text(sys_prompt, user_prompt)

        # Create a new self.cache entry with last_accessed timestamp
        new_entry = {
            "sha1": sha1_hash,
            "docstring": res,
            "last_accessed": self._get_current_timestamp(),
        }

        # Add new entry to the self.cache for the current file
        if self.file_path not in self.cache:
            self.cache[self.file_path] = []

        self.cache[self.file_path].append(new_entry)

        # Return the new docstring from AI
        return res

    def remove_old_entries(self, minutes):
        current_timestamp = self._get_current_timestamp()
        threshold_timestamp = current_timestamp - (minutes * 60)

        # Remove old entries for each file in the self.cache
        for file_path, entries in self.cache.items():
            self.cache[file_path] = [
                entry
                for entry in entries
                if "last_accessed" in entry
                and entry["last_accessed"] >= threshold_timestamp
            ]

    def wrap_text(self, text: str, max_length=120, indent=0):
        wrapped_lines = []
        lines = text.strip().splitlines()

        # Ensure indent is an integer
        try:
            indent = int(indent)
        except ValueError:
            indent = 0  # Default to 0 if it's an invalid value

        spacer = " " * (indent * 4)

        for line in lines:
            line = spacer + line.strip()
            while len(line) > max_length:
                # Find the last space after the indent to split at (searching past
                # the spacer avoids an infinite loop on a single unbroken token)
                split_at = line.rfind(" ", len(spacer), max_length)
                if split_at == -1:
                    split_at = max_length  # no space found, hard-split at max_length
                wrapped_lines.append(line[:split_at].rstrip())
                line = spacer + line[split_at:].strip()
            wrapped_lines.append(line)
        return wrapped_lines

    def count_and_divide_whitespace(self, line):
        leading_whitespace = len(line) - len(line.lstrip())
        if leading_whitespace == 0:
            return 0
        return leading_whitespace // 4

    def complete(self):
        # Create a temporary file
        with tempfile.NamedTemporaryFile(delete=False) as temp_file:
            temp_file_path = temp_file.name
            temp_file.write(
                "".join(self.lines).encode()
            )  # Write the content to the temporary file

        try:
            # Attempt to compile the temporary file
            result = subprocess.run(
                [sys.executable, "-m", "py_compile", temp_file_path],
                capture_output=True,
                text=True,
            )

            # If there is no compilation error (i.e., result.returncode == 0), move the file to the destination
            if result.returncode == 0:
                with open(self.file_path, "w") as file:
                    print(f"    Wrote: {self.file_path}")
                    file.write("".join(self.lines))  # Write to the destination file
            else:
                print(f"    Error compiling file: {result.stderr}")
                if self.debug:
                    name = "/tmp/" + os.path.basename(self.file_path) + ".failed"
                    with open(name, "w") as file:
                        file.write("".join(self.lines))
                    print(f"    Copied here: {name}")
                if self.exit:
                    sys.exit(1)

        finally:
            # Clean up the temporary file
            os.remove(temp_file_path)

        # Evict cache entries not accessed in the last 14 days (1440 minutes per day)
        self.remove_old_entries(1440 * 14)
        self._save_cache()

    def generate_class_docstring(self):
        line = self.lines[self.line_index]
        class_definition = line
        output = re.sub(r"\s+", " ", class_definition.rstrip().replace("\n", " "))
        print("   -> " + output)
        prompt_class_code = [class_definition]

        if self.count_and_divide_whitespace(class_definition) > 0:
            self.line_index = self.line_index + 1
            return None

        t = self.line_index + 1
        # Collect all lines that belong to the class, including "pass" or single-line classes
        while (
            t < len(self.lines)
            and not self.lines[t].startswith("def")
            and not self.lines[t].startswith("class")
        ):
            prompt_class_code.append(self.lines[t].rstrip())
            t += 1

        class_docstring = self.get_ai_docstring(
            system_prompt_class, "\n".join(prompt_class_code), output
        )
        class_docstring = self.wrap_text(class_docstring, max_length=120, indent=1)
        class_docstring[-1] = class_docstring[-1] + "\n"
        class_docstring = [line + "\n" for line in class_docstring]
        class_docstring[-1] = class_docstring[-1].rstrip() + "\n"

        # Check for existing docstring and replace it
        docstring_start_index = None
        docstring_end_index = None

        # Look for an existing class docstring (the line right after the class
        # definition starts with """ if one is present)
        if self.line_index + 1 < len(self.lines) and self.lines[
            self.line_index + 1
        ].strip().startswith('"""'):
            # Docstring exists, find the end of it
            docstring_start_index = (
                self.line_index + 1
            )  # The docstring starts from line after the class definition
            for i, line in enumerate(
                self.lines[self.line_index + 2 :], start=self.line_index + 2
            ):
                if line.strip().startswith('"""'):
                    docstring_end_index = i  # End of the docstring
                    break

        # If a docstring exists, replace it
        if docstring_start_index is not None and docstring_end_index is not None:
            self.lines = (
                self.lines[:docstring_start_index]
                + self.lines[docstring_end_index + 1 :]
            )

        # Insert the new docstring after the class definition
        self.lines = (
            self.lines[: self.line_index + 1]
            + class_docstring
            + self.lines[self.line_index + 1 :]
        )

        # self.print_debug("class docstring: " + class_definition.strip(), self.lines)

        return True

    def generate_function_docstring(self):
        line = self.lines[self.line_index]
        multiline_line = ""

        if not (
            line.strip().endswith("):")
            and not re.search(r"\)\s*->\s*(.*)\s*:.*", line.strip())
        ):
            self.multiline_index = 0
            # Multiline def signature
            while self.line_index < len(self.lines):
                multiline_line += self.lines[self.line_index]
                if re.match(
                    r".*\):$", self.lines[self.line_index].strip()
                ) or re.search(
                    r".*\)\s*->\s*(.*)\s*:.*", self.lines[self.line_index].strip()
                ):
                    break
                self.line_index += 1
                self.multiline_index += 1

        def_definition = line if multiline_line == "" else multiline_line
        output = re.sub(r"\s+", " ", def_definition.rstrip().replace("\n", " "))
        print("     -> " + output)
        prompt_def_code = (
            multiline_line.split("\n") if multiline_line != "" else [def_definition]
        )

        indent_line = self.count_and_divide_whitespace(
            def_definition if multiline_line == "" else multiline_line.splitlines()[0]
        )
        spacer_line = "" if indent_line == 0 else " " * (indent_line * 4)
        spacer_line_minus = "" if indent_line < 2 else " " * ((indent_line - 1) * 4)
        spacer_line_plus = "" if indent_line == 0 else " " * ((indent_line + 1) * 4)

        t = self.line_index + 1
        # Collect all lines that belong to the function
        while t < len(self.lines):
            starts_with_def = self.lines[t].strip().startswith("def")

            # same indent
            if starts_with_def and self.lines[t].startswith(spacer_line):
                break

            # outside
            if starts_with_def and self.lines[t].startswith(spacer_line_minus):
                break

            # nested defs belong to this function, so keep collecting
            if starts_with_def and self.lines[t].startswith(spacer_line_plus):
                pass

            if self.lines[t].rstrip() != def_definition.strip():
                prompt_def_code.append(self.lines[t].rstrip())
            t += 1

        # Now that we have the full function body, generate the docstring
        indent = indent_line + 1
        def_docstring = self.get_ai_docstring(
            system_prompt_def, "\n".join(prompt_def_code), output
        )
        def_docstring = self.wrap_text(def_docstring, max_length=120, indent=indent)
        def_docstring[-1] = def_docstring[-1] + "\n"
        if def_definition.strip() == def_docstring[0].strip():
            def_docstring = def_docstring[
                1 if multiline_line == "" else self.multiline_index :
            ]
        def_docstring = [line + "\n" for line in def_docstring]
        def_docstring[-1] = def_docstring[-1].rstrip() + "\n"

        # Handle one-liner docstring or multi-line docstring
        if self.line_index + 1 < len(self.lines) and '"""' in self.lines[
            self.line_index + 1
        ]:
            stripped_line = self.lines[self.line_index + 1].strip()
            if re.match(r'"""[\s\S]+?"""', stripped_line):
                # This is a one-liner or multi-line docstring (we always replace with a multi-line docstring)
                self.lines = (
                    self.lines[: self.line_index + 1]
                    + def_docstring
                    + self.lines[self.line_index + 2 :]
                )
            else:
                # Replace the entire docstring if it's multi-line
                end_index = self.line_index + 2
                while end_index < len(self.lines) and not self.lines[
                    end_index
                ].strip().startswith('"""'):
                    end_index += 1

                if (
                    end_index < len(self.lines)
                    and self.lines[end_index].strip() == '"""'
                ):
                    # Found the end of the docstring, now replace the entire docstring block
                    self.lines = (
                        self.lines[: self.line_index + 1]
                        + def_docstring
                        + self.lines[end_index + 1 :]
                    )
        else:
            # If no docstring exists, simply insert the generated docstring
            self.lines = (
                self.lines[: self.line_index + 1]
                + def_docstring
                + self.lines[self.line_index + 1 :]
            )

        # self.print_debug("def docstring: " + def_definition.strip(), self.lines)

        self.line_index = self.line_index + len(def_docstring)
        return True

    def generate_file_docstring(self):
        # Check if we should add a file-level docstring
        # if not self.should_add_file_docstring():
        #     return 0  # Skip generating file-level docstring if not needed

        shebang = ""

        # If the first line is not a shebang (e.g., #! anything), prepend one
        if self.lines and not self.lines[0].startswith("#!"):
            self.lines = ["#!/usr/bin/env python\n"] + self.lines
            shebang = "#!/usr/bin/env python\n"
        else:
            shebang = self.lines[0]

        # Check if there's already an existing file-level docstring or comment block
        # We assume the file-level docstring starts with triple quotes (""" or ''') and is at the top
        docstring_start_index = None
        docstring_end_index = None

        if len(self.lines) > 1 and self.lines[1].strip().startswith('"""'):
            # If the second line starts with triple quotes, it may be a docstring
            docstring_start_index = 1  # The docstring starts from line 2
            for i, line in enumerate(self.lines[2:], start=2):
                if line.strip().startswith('"""'):
                    docstring_end_index = i  # End of the docstring
                    break

        # Generate new file-level docstring
        general_description = self.get_ai_docstring(
            system_prompt_general, "".join(self.lines), self.file_path
        )
        general_description = self.wrap_text(
            general_description, max_length=120, indent=0
        )
        docstring = [line + "\n" for line in general_description]

        # self.print_debug("docstring_end_index", str(docstring_end_index))
        # self.print_debug("self.lines[:docstring_start_index]", self.lines[:docstring_start_index])
        # self.print_debug("docstring", docstring)
        # self.print_debug("self.lines[docstring_end_index + 1 :]", self.lines[docstring_end_index + 1 :])

        # If a docstring exists, replace it with the new one
        if docstring_start_index is not None and docstring_end_index is not None:
            self.lines = (
                self.lines[:docstring_start_index]
                + docstring
                + self.lines[docstring_end_index + 1 :]
            )
        else:
            # Insert the generated docstring directly after the shebang (no extra newline)
            self.lines = [shebang] + docstring + self.lines[1:]

        # self.print_debug("file docstring: " + self.file_path, self.lines)

        return len(docstring)

    def generate_docstrings(self):
        if len(self.lines) == 0:
            return

        self.line_index = self.generate_file_docstring()

        while self.line_index < len(self.lines):
            line = self.lines[self.line_index]

            # For classes, generate class docstring
            if line.strip().startswith("class "):
                if not self.generate_class_docstring():
                    continue  # Skip to the next line

            # For functions, generate function docstring
            elif line.strip().startswith("def "):
                if not self.generate_function_docstring():
                    continue  # Skip to the next line

            self.line_index = self.line_index + 1

        self.complete()


def process_file(file_path, debug=False, exit=False):
    """Process a single file by generating docstrings."""
    Docstring(file_path, debug=debug, exit=exit).generate_docstrings()


def process_directory(directory_path, recursive=False, debug=False, exit=False):
    """Process all Python files in the directory with progress tracking."""

    # List all Python files (only the top level unless recursive is set)
    if recursive:
        python_files = [
            os.path.join(root, file)
            for root, dirs, files in os.walk(directory_path)
            for file in files
            if file.endswith(".py")
        ]
    else:
        python_files = [
            os.path.join(directory_path, file)
            for file in os.listdir(directory_path)
            if file.endswith(".py")
        ]

    total_files = len(python_files)  # Total Python file count
    processed_files = 0

    print(f"Processing {total_files} Python files...")

    for file_path in python_files:
        processed_files += 1
        print(f"\nProcessing file {processed_files}/{total_files}: {file_path}")
        process_file(file_path, debug=debug, exit=exit)


def main():
    # Set up the argument parser
    parser = argparse.ArgumentParser(
        description="Generate file-level docstrings for Python files."
    )
    parser.add_argument("path", help="Path to a Python file or directory.")
    parser.add_argument(
        "-r",
        "--recursive",
        action="store_true",
        help="Recursively process all Python files in the directory.",
    )
    parser.add_argument(
        "-d",
        "--debug",
        action="store_true",
        help="Copies failed updates to /tmp/",
    )
    parser.add_argument(
        "-e",
        "--exit",
        action="store_true",
        help="Exits on failure",
    )
    # Parse the arguments
    args = parser.parse_args()

    # Check if the path is a file or directory
    if os.path.isfile(args.path):
        # If it's a file, process it directly
        process_file(args.path, debug=args.debug, exit=args.exit)
    elif os.path.isdir(args.path):
        # If it's a directory, process all Python files
        process_directory(
            args.path, recursive=args.recursive, debug=args.debug, exit=args.exit
        )
    else:
        print(f"Error: {args.path} is neither a valid file nor a directory.")

    OpenAICost.print_cost_metrics()


if __name__ == "__main__":
    main()

Now, let’s break down how this script works:

  • OpenAICost Class: This class tracks the cost of using the OpenAI API based on the number of tokens used. It provides methods to send cost data and print cost metrics.
  • OpenAIProvider Class: This class handles the interaction with the OpenAI API. It estimates the number of tokens in the input text, selects the appropriate model based on the token count, and sends requests to improve the text.
  • Docstring Class: This is the heart of the script. It reads a Python file, generates file-, class-, and function-level docstrings via the OpenAI API, and writes the updated file back only after it passes a py_compile check. It also maintains a SHA-1-keyed cache of previously generated docstrings, so unchanged code never triggers a repeat API call.
  • Main Functionality: The script can process a single file or an entire directory of Python files, generating docstrings for each. It includes command-line arguments for flexibility.
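The caching scheme is worth a closer look: each request is keyed by the SHA-1 of the prompt, so a function whose source hasn't changed never costs a second API call. Here is a minimal standalone sketch of that lookup — the `cache` dict mirrors the JSON structure the script persists, and the file name is purely illustrative:

```python
import hashlib
import time

def cache_lookup(cache, file_path, user_prompt):
    # Key the prompt by its SHA-1, exactly as _generate_sha1 does
    key = hashlib.sha1(user_prompt.encode("utf-8")).hexdigest()
    for entry in cache.get(file_path, []):
        if entry["sha1"] == key:
            # Touch the entry so remove_old_entries keeps it around
            entry["last_accessed"] = int(time.time())
            return entry["docstring"]
    return None  # cache miss: the caller falls through to the AI request

cache = {"app.py": [{
    "sha1": hashlib.sha1(b"def add(a, b):").hexdigest(),
    "docstring": '"""Add two numbers."""',
    "last_accessed": 0,
}]}

print(cache_lookup(cache, "app.py", "def add(a, b):"))  # hit: cached docstring
print(cache_lookup(cache, "app.py", "def sub(a, b):"))  # miss: None
```

The `last_accessed` timestamp is what lets `remove_old_entries` later evict anything untouched for two weeks without re-hashing the source.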

This tool can significantly enhance your coding workflow by ensuring that your code is well-documented and easy to understand. I hope you find it as useful as I do! Happy coding!
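One design choice worth highlighting is the compile-before-write gate in `complete()`: the rewritten source only replaces the original file if `py_compile` accepts it, so a malformed AI response can never clobber working code. The check boils down to this standalone sketch (using `sys.executable` rather than a hard-coded `python` so it works in virtualenvs):

```python
import os
import subprocess
import sys
import tempfile

def compiles_cleanly(source: str) -> bool:
    # Write the candidate source to a temp file, as complete() does
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        temp_path = f.name
    try:
        # py_compile exits non-zero on any syntax error
        result = subprocess.run(
            [sys.executable, "-m", "py_compile", temp_path],
            capture_output=True,
            text=True,
        )
        return result.returncode == 0
    finally:
        os.remove(temp_path)

print(compiles_cleanly('print("ok")'))   # valid source -> True
print(compiles_cleanly("def broken(:"))  # syntax error -> False
```

Running the check in a subprocess, rather than calling `compile()` in-process, keeps a pathological AI response from executing anything in the tool itself.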


📚 Further Learning

🎥 Watch this video for more: