ShellStat

Phani Kumar Yadavilli
Aug 17, 2023

Over the weekend, while reviewing some system logs on my Mac, I came across an exciting use case. I was running a few commands and trying to remember which ones I execute most often in the terminal. What if I built a tool that gives me insights into the commands I use frequently, the potentially harmful commands, the commands that could be optimised, and the patterns in my command sequences? What's more exciting is how the idea has evolved over the last few days. As the ideas are flowing, I would like to jot them down and discuss the design and architectural patterns I am using to build the features, so that I can get feedback from the community.

Introducing ShellStat, an analytics tool designed to provide insights into shell history, offering a unique perspective on user behaviour, trends, and efficiency within the terminal environment.

I started development with a few primary objectives in mind, though there is a lot of scope for extension:

  1. Real-Time Analytics: Provide real-time insights into shell history, enabling users to instantly recognize patterns, preferences, and potential areas of improvement in their terminal usage.
  2. User Behaviour: Gain deeper insight into user behaviour within the terminal. This can help in personalizing shell environments, optimizing workflows, and enhancing overall terminal productivity.
  3. Historical Data Analysis: Maintain a robust and efficient storage mechanism, allowing users not only to analyze their current shell interactions but also to dive into historical data for longer-term trends and patterns.
  4. Versatility & Adaptability: Ensure ShellStat is adaptable across different terminal environments and can interpret various shell syntaxes.
  5. Resource Efficiency: The design should not be resource intensive, so that the system keeps running smoothly even while peak analytical operations are being carried out.

Some of the challenges I faced while designing ShellStat:

  1. Data Ingestion: Ingesting data from the shell history in real time or near real time without introducing performance issues on the user's machine.
  2. Data Storage: Storing the growing data with a small resource footprint, so that the user's system performance is not affected.
  3. Data Parsing: Parsing the shell history to extract the relevant information, given that it may contain complex commands.
  4. Customized Analytics: The tool should cater to a wide range of analytical requirements while keeping a simple, user-friendly interface.
  5. Scalability & Evolution: As shell environments evolve, the tool has to remain relevant, efficient, and up to date without demanding frequent or significant overhauls.
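
To make the ingestion and parsing challenges concrete, here is a minimal sketch in Python. It is not ShellStat's actual code: the function names (read_new_lines, parse_history_line, program_names) are hypothetical, and it assumes the history file is either plain one-command-per-line text or zsh's extended format (": <epoch>:<duration>;<command>"). Multi-line commands, subshells, and redirections are deliberately not handled.

```python
import re
import shlex
from typing import List, Optional, Tuple

# zsh "extended history" lines look like ": <epoch>:<duration>;<command>".
_ZSH_EXTENDED = re.compile(r"^: (\d+):(\d+);(.*)$", re.DOTALL)
# Leading environment assignments such as FOO=1 in "FOO=1 make test".
_ENV_ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")
_SEPARATORS = {"|", "||", "&&", ";", "&"}


def read_new_lines(path: str, offset: int) -> Tuple[List[str], int]:
    """Read only the complete lines appended after byte `offset`.

    Returns the new lines and the offset to resume from, so the whole
    history file is never re-read on each change.
    """
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    end = data.rfind(b"\n") + 1  # 0 when no complete line is available yet
    text = data[:end].decode("utf-8", errors="replace")
    return text.split("\n")[:-1], offset + end


def parse_history_line(line: str) -> Tuple[Optional[int], str]:
    """Return (epoch seconds or None, command text) for one history line."""
    line = line.rstrip("\n")
    match = _ZSH_EXTENDED.match(line)
    if match:
        return int(match.group(1)), match.group(3).strip()
    return None, line.strip()


def program_names(command: str) -> List[str]:
    """List the program started by each stage of a pipeline or command chain."""
    try:
        lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
        lexer.whitespace_split = True
        tokens = list(lexer)
    except ValueError:
        # Unbalanced quotes: fall back to a plain whitespace split.
        tokens = command.split()
    names: List[str] = []
    expect_program = True
    for token in tokens:
        if token in _SEPARATORS:
            expect_program = True
        elif expect_program:
            if _ENV_ASSIGNMENT.match(token):
                continue  # skip FOO=1 and keep looking for the program
            names.append(token)
            expect_program = False
    return names
```

A file-watcher callback could call read_new_lines with the last stored offset each time the history file changes, then pass every returned line through parse_history_line before storing it. Keeping only a byte offset between calls is one way to keep ingestion cheap as the history grows.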

Tools & Technologies:

  1. SQLite3: SQLite3 is the primary storage system for ShellStat. It offers an efficient way to store and retrieve data, making it suitable for real-time analytics.
    Advantages: Lightweight, serverless, self-contained, and transactional. It's a compact, standalone data store that does not require us to set up a database server.
    https://www.sqlite.org/index.html
  2. Python: The primary programming language for developing ShellStat. Used for building the parsers, the analysers, the file watcher that consumes real-time events, and the interface with SQLite.
  3. Watchdog: Monitors the shell history file for changes, triggering data ingestion and processing in real time.
    Advantages: Supports multiple Linux and *nix platforms. Lets the developer consume filesystem events as an event source while commands are executed in the shell.
    https://pypi.org/project/watchdog/
  4. Potential Backup & Restore Solution (S3): For backing up the SQLite database files to available, durable, distributed, and fault-tolerant storage.
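
As a rough illustration of how SQLite can cover both storage and the "frequently used commands" insight, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, and helper functions (open_store, record, top_programs) are assumptions made for this example and are not taken from the ShellStat repository.

```python
import sqlite3
from typing import Iterable, List, Optional, Tuple

# Hypothetical schema: one row per executed command.
SCHEMA = """
CREATE TABLE IF NOT EXISTS commands (
    id      INTEGER PRIMARY KEY,
    ran_at  INTEGER,         -- epoch seconds, NULL if the shell recorded none
    program TEXT NOT NULL,   -- first word of the command, e.g. "git"
    command TEXT NOT NULL    -- full command line as typed
);
CREATE INDEX IF NOT EXISTS idx_commands_program ON commands (program);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database file and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record(conn: sqlite3.Connection,
           rows: Iterable[Tuple[Optional[int], str]]) -> None:
    """Insert (timestamp, command) pairs in one transaction, skipping blanks."""
    prepared = [(ts, cmd.split()[0], cmd) for ts, cmd in rows if cmd.strip()]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO commands (ran_at, program, command) VALUES (?, ?, ?)",
            prepared,
        )


def top_programs(conn: sqlite3.Connection,
                 limit: int = 5) -> List[Tuple[str, int]]:
    """Return the most frequently used programs with their usage counts."""
    cursor = conn.execute(
        "SELECT program, COUNT(*) AS uses FROM commands "
        "GROUP BY program ORDER BY uses DESC, program ASC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()
```

Batching inserts inside a single transaction keeps write overhead low, which matters for the resource-efficiency goal. Passing a file path instead of ":memory:" to open_store would persist the history between runs.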

Additional Considerations:

  1. Web Framework (Flask): For serving the web dashboard.
  2. Encryption Libraries: For securing sensitive data and protecting the privacy of user-specific data.

https://github.com/wandermonk/ShellStat

Written by Phani Kumar Yadavilli

I am a Big Data Analytics Engineer passionate about writing good code and building highly scalable distributed systems.
