Panfactum Feature Roadmap

Next Stable Release

Release Date: 11/29/2024

First Release Candidate: 11/15/2024

Our last stable release production-hardened the foundational components of the Panfactum Stack and was our first foray into self-service guides that would allow organizations to deploy and manage the Stack with ease and confidence. With the Stack foundations solidified, we now turn our focus to five new goals for the next release:

Extend the base stack with high value but optional "addons" that unlock new capabilities for your organization
Invest in documentation with many additional guides, tutorials, and reference materials
Improve the ergonomics of the onboarding experience for new users
Fully optimize any Panfactum-associated infrastructure costs
Add system telemetry to help us make inform decisions on where to focus product development efforts

New Addons

The basic stack components provide a powerful starting point for managing infrastructure on AWS and Kubernetes. Using that foundation, we can now provide easy-to-deploy, fully-integrated addons that can be used to replace expensive and/or ineffective third-party managed services. Here is a preview of what is coming:

Monitoring and Observability: A full monitoring and observability system based on the Grafana ecosystem that will collect metrics, logs, and traces from all system components. Data will be available in an integrated dashboarding solution, and we will provide over 25 out-of-the-box dashboards for analyzing system health.
Workflow Engine: A flexible, and full-featured workflow engine and Web UI based on Argo Workflows. Workflows can be used for CI / CD, data pipelines, and integrated into your application's business logic. We will also provide IaC submodules for configuring common workflows such as deploying your IaC, building container images, etc.
BuildKit: BuildKit is the world's most efficient container image builder, and our addon will unlock the ability to build container image both locally and in your CI pipelines at 10x the speed as any other build system. Moreover, it will automatically be configured for multi-platform images allowing your generated images to run on both arm64 and amd64 nodes for optimal cost savings.
KEDA: Our KEDA addon will unlock a new autoscaling ability: scale-to-zero. This will allow your async system components (e.g., queue consumers) to run much more efficiently. Moreover, KEDA allows you to better align your horizontal scaling with actual load (e.g., messages in a queue) rather than proxy metrics (e.g., CPU and memory).
Self-hosted GitHub Actions: While our primary focus for CI / CD investment will be in Argo Workflows, there is no denying the ubiquity and simplicity of GitHub actions. To offer organizations the option to continue using their existing GHA workflows, we will provide an addon for self-hosted GHA runners that will greatly improve performance and reduce cost compared to the GHA community runners.

Documentation Investments

One of our core values is ensuring that we provide sufficient documentation to ensure that anyone using the Stack understands how it works, how to use it, and how to extend it. While we feel we are doing a great job compared to Panfactum alternatives, we are only 5% to our dream state. As a result, we are doubling down on docs in this next release cycle.

In particular, we are focusing on the following:

Tutorials for how to write first-party IaC modules for the Stack and how to leverage the Stack for various software development goals
Reference documentation for every IaC module in the system
High-level architecture documentation for the various subsystems in the Stack
Runbooks for common Stack operations (e.g., restoring from database backups, provisioning new users, etc.)

Also, by overwhelming popular request: we will be adding search!

Improved Onboarding Experience

As we have been onboarding more and more users, it has become apparent that our onboarding experience is not as straightforward as we would have liked. As a result, we are adding the following enhancements:

Automate any manual steps that did not add value to the user in understanding the overall system
Significantly improve the devenv setup speed on MacOS
Provide more sane defaults to IaC module options to reduce decision friction
Add additional callouts of sharp edges that commonly impede new users
Address any and all bugs found in the bootstrapping section

Cost Savings

We want the Panfactum Stack to not just be the most efficient way to run software in AWS, but we want to be 10% of the cost of any alternative. While we were already close to that goal, we want to push the boundaries event further.

To that end, we are adding the following enhancements:

All Stack components will be sufficiently hardened so that they can run exclusively on extremely cheap spot instances
All the Stack components will run on cheaper arm64 nodes by default
Discounted burstable nodes
A custom bin-packing Kubernetes scheduler to replace the inefficient default scheduler provided by AWS
A new input variable for many IaC modules, enhanced_ha_enabled, that can be set to false in non-critical environments to significantly reduce costs for a minor hit in overall availability.
A new suspend command that can be used to temporarily turn off an environment to reduce costs while not in use.
Database optimizations to reduce cross-AZ network traffic
OpenCost to collect granular data on the costs of running every individual workload

Telemetry

While we believe our self-hosted architecture is the best approach for our users, one disadvantage we have compared to cloud services is a lack of visibility into how our system is being used.

This hinders our ability to decide where to invest our efforts. The last thing that we want to do is spend months working in areas that don't improve the experience for our users and ignore areas that do.

However, this desire for visibility cannot come at the cost of our users' privacy and security in their mission-critical systems.

Ultimately, we landed on the following approach:

Collect basic metadata only through static analysis of our user's IaC and CaC (e.g., what modules are deployed and how many times); never collect information from live or running systems.
Collect errors that occur in Panfactum-provided scripts. Ensure that error reports are full sanitized client-side, so they do not contain any sensitive information.
Provide explicit documentation about exactly what metadata is collected and how each item is used. Ensure the telemetry code is source-available, just as the rest of the Stack.
Allow enterprise plan customers the ability to easily disable the feature since we already have a close relationship with their users.

If you have feedback on this roadmap item please feel free to each out on our Discord with your thoughts. We want to implement this in the most transparent and benign way possible.

Feature Requests

All feature requests are recorded in our issue tracker. Features are prioritized based on the following criteria:

Customer tier
Number of issue upvotes 👍