Data Streaming allows you to replicate data from your Crunchtime products to a Snowflake data warehouse. This article answers the most common questions about setup, technical requirements, operations, and support for the Data Streaming service.
General Information
What Crunchtime products can I use for this?
Inventory, Labor, and Zenput.
Does this include the Cruise product?
Yes, provided that Crunchtime hosts the environment.
Can I use Data Streaming with my testing environments?
No. This service is only available for the production environment.
What does Data Streaming include?
Crunchtime will manage the streaming data replication to a Snowflake data warehouse operated by the customer. This includes:
- Monitoring the stream and availability of data delivery to the Snowflake data warehouse.
- Keeping eligible tables and columns in the Snowflake data warehouse synchronized with schema changes in the products as new features are created.
What does "stream" mean?
It means that change logs are monitored on the production transaction system and changes are moved to the Snowflake environment as they are found. This is not a "batch" process.
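As a rough illustration of the difference from a batch copy, the sketch below applies only the change-log entries that have not yet been applied to a replica. It is a simplified model and not Crunchtime's implementation: the `Change` tuple layout, the `apply_new_changes` function, and the row keys are all hypothetical.

```python
from typing import Any

# Hypothetical change-log entry: (operation, row key, new value).
Change = tuple[str, str, Any]


def apply_new_changes(replica: dict, change_log: list[Change], next_position: int) -> int:
    """Apply entries from next_position onward and return the new position."""
    for operation, key, value in change_log[next_position:]:
        if operation == "delete":
            replica.pop(key, None)
        else:  # "insert" or "update"
            replica[key] = value
    return len(change_log)


change_log: list[Change] = [
    ("insert", "item-1", {"qty": 5}),
    ("insert", "item-2", {"qty": 3}),
]
replica: dict = {}

# First pass picks up the two inserts.
position = apply_new_changes(replica, change_log, 0)

# Later changes are picked up incrementally, without recopying the whole table.
change_log.append(("update", "item-1", {"qty": 7}))
change_log.append(("delete", "item-2", None))
position = apply_new_changes(replica, change_log, position)

print(replica)  # {'item-1': {'qty': 7}}
```

A batch process would instead recopy the full table on a schedule, regardless of how little had changed.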
Data and Tables
If I'm a legacy Inventory, Labor, or Cruise customer with direct database access, will I have access to the same tables as I do today?
No. We will stream customer transaction and configuration data. We will not stream changes for the following kinds of tables:
- Internal logging tables
- Internal processing tables for orchestration of product features
- Internal product metadata tables
Is there an exact table list available?
Yes. The table lists are provided here:
- Table list for Inventory & Labor
- Table list for Zenput
If a table is not present in the list by default, can it be added?
No. Only the tables in the table lists are supported.
If we do not want all of the tables listed, can we reduce the number of tables?
No. It is all or nothing.
What happens when new features are added to the Inventory, Labor, and Zenput products, resulting in new data?
We will add new tables to the stream when they contain customer data. The same applies to new columns added to tables that are already streamed.
What latency should be expected for the data freshness?
Up to 4 hours for Inventory and Labor, and up to 12 hours for Zenput. This means a change made in the system will become visible in the Snowflake data warehouse within 4 hours (Inventory and Labor) or 12 hours (Zenput) of the change being made.
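If you want to check freshness on your side, the latency bounds above translate into a simple calculation. The following is a minimal sketch: the `MAX_LATENCY` mapping uses the figures from this article, while the function names and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Upper bounds on data latency as stated in this article.
MAX_LATENCY = {
    "inventory": timedelta(hours=4),
    "labor": timedelta(hours=4),
    "zenput": timedelta(hours=12),
}


def latest_visible_by(product: str, changed_at: datetime) -> datetime:
    """Return the latest time by which a change should be visible in Snowflake."""
    return changed_at + MAX_LATENCY[product.lower()]


def is_overdue(product: str, changed_at: datetime, now: datetime) -> bool:
    """Return True if a change has had longer than the stated latency to arrive."""
    return now > latest_visible_by(product, changed_at)


changed_at = datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc)
print(latest_visible_by("Inventory", changed_at))  # 2024-01-15 13:00:00+00:00
print(latest_visible_by("Zenput", changed_at))     # 2024-01-15 21:00:00+00:00
```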
Snowflake Setup and Requirements
If I don't have a Snowflake data warehouse, how do I get one?
The customer is responsible for procuring a Snowflake environment from Snowflake and for operating it. Additionally, the customer must grant Crunchtime all the permissions necessary to fully manage the warehouse schema that is provisioned as the target of the data stream. This is necessary for Crunchtime to maintain table and column changes as the underlying product changes.
How much does a Snowflake environment cost?
Please contact Snowflake. Crunchtime is not an authorized reseller. Your cost will vary based on usage and operational decisions that Snowflake can assist you with.
Are other databases available for this service besides Snowflake?
No. We chose Snowflake because it is an industry leader in data lakes and reporting, and because it supports egress to virtually every popular database.
We are not familiar with the Snowflake concepts and a "One Big Table" design. Can Crunchtime support this for us?
A Professional Services engagement can be evaluated. Contact Customer Success or your Implementation Services team to learn how to start this process.
Security and Operations
Is the data transmitted to Snowflake secure?
Yes, SSL is used.
Can a VPN be used in conjunction with the data stream?
No, we are not considering this at this time.
How can I protect the reporting created in the Snowflake data warehouse from unplanned events such as reseeding?
We recommend that the schema targeted by the data stream be different from the schema that owns your reporting models. With two schemas and a data modeling process that leverages a "One Big Table" design, you can insulate your reporting structures from the data stream. This allows your reporting to continue to operate with data "as of" the termination of the stream while the stream schema is being reseeded.
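The two-schema idea can be sketched locally. The example below uses Python's built-in `sqlite3` module as a stand-in for Snowflake, with two attached in-memory databases playing the role of the two schemas. The schema names (`stream_raw`, `reporting`), table names, and columns are hypothetical and do not come from the Crunchtime table lists. The point it illustrates is that a materialized reporting table remains queryable after the streamed tables are removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two attached in-memory databases stand in for two Snowflake schemas.
conn.execute("ATTACH DATABASE ':memory:' AS stream_raw")
conn.execute("ATTACH DATABASE ':memory:' AS reporting")

# Hypothetical streamed tables in the stream target schema.
conn.execute("CREATE TABLE stream_raw.locations (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE stream_raw.sales (location_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO stream_raw.locations VALUES (?, ?)",
    [(1, "Downtown"), (2, "Airport")],
)
conn.executemany(
    "INSERT INTO stream_raw.sales VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 50.0)],
)


def rebuild_reporting(connection: sqlite3.Connection) -> None:
    """Materialize a denormalized 'One Big Table' in the reporting schema."""
    connection.execute("DROP TABLE IF EXISTS reporting.sales_obt")
    connection.execute(
        "CREATE TABLE reporting.sales_obt AS "
        "SELECT l.id AS location_id, l.name AS location_name, s.amount "
        "FROM stream_raw.sales AS s "
        "JOIN stream_raw.locations AS l ON l.id = s.location_id"
    )


rebuild_reporting(conn)

# Simulate the start of a reseed: the streamed tables are removed.
conn.execute("DROP TABLE stream_raw.sales")
conn.execute("DROP TABLE stream_raw.locations")

# Reporting still answers queries with data as of the last rebuild.
total = conn.execute("SELECT SUM(amount) FROM reporting.sales_obt").fetchone()[0]
print(total)  # 250.0
```

A view in the reporting schema would not give the same protection, because a view fails once the tables it reads from are dropped. That is why this sketch copies the data into a table.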
Troubleshooting and Risks
Are there any operational risks that should be known about this service?
Yes. Data streaming relies on a cache, and that cache is not infinite. If streaming is interrupted for too long, the system will become unresponsive.
What is "too long"?
It varies by customer, because the number of locations and the product features in use differ for each customer. We plan to be able to support an interruption of up to 4 hours.
If an interruption will exceed 4 hours, what happens?
The Crunchtime team will terminate the stream. This will result in the Snowflake environment needing to be reseeded, where all the tables in the Snowflake data warehouse are removed and recreated.
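The decision described in the answers above can be summarized as a small rule. This is only a sketch of the policy as stated in this article: the 4-hour figure is the planned tolerance, the real limit varies by customer, and the function name and return values are hypothetical.

```python
from datetime import timedelta

# Planned tolerance stated in this article; the actual limit varies by customer.
PLANNED_MAX_INTERRUPTION = timedelta(hours=4)


def stream_action(
    interruption: timedelta,
    limit: timedelta = PLANNED_MAX_INTERRUPTION,
) -> str:
    """Return what happens to the stream after an interruption of this length."""
    if interruption <= limit:
        # The cache can still hold the backlog, so streaming can resume.
        return "resume"
    # The stream is terminated and the Snowflake tables must be reseeded.
    return "terminate_and_reseed"


print(stream_action(timedelta(hours=1)))  # resume
print(stream_action(timedelta(hours=6)))  # terminate_and_reseed
```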
How long does a "reseed" take?
It depends on the amount of data and may range from a few hours to a couple of weeks. Large tables require constant oversight by a DBA to avoid failures that would require the reseed to be aborted and restarted.
Implementation and Support
How long does Data Streaming take to implement?
Implementation takes between a few days and a couple of weeks, which includes the initial data seeding and establishing the required access to the customer-managed Snowflake data warehouse. The total time depends on the amount of data in the product database, which would need to be estimated on a customer-by-customer basis.
Does Crunchtime offer support to Data Streaming customers?
Crunchtime provides limited support for Data Streaming.