DB and Storage

Cloud Storage

Object storage organized into buckets. Stores any amount of data and retrieves it any number of times; ideal for storing BLOBs.

  • Stored objects are immutable: a change either overwrites the object or creates a new version.
  • Supports versioning.
  • Cached at edge locations by default (for publicly readable objects).
  • Access can be managed with IAM and ACLs.
  • Lifecycle management policies manage data automatically, for example by removing old objects or limiting the number of versions kept.
  • Directory sync keeps a VM directory in sync with a bucket.
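The lifecycle policies mentioned above are usually written as a small JSON document attached to the bucket. The sketch below builds one in Python with two rules: remove old objects, and limit the number of retained versions. The helper name `build_lifecycle_config` and the thresholds are illustrative assumptions; the field names (`rule`, `action`, `condition`, `age`, `isLive`, `numNewerVersions`) follow the bucket lifecycle configuration format as I understand it and should be checked against the current documentation before use.

```python
import json


def build_lifecycle_config(max_age_days: int, max_noncurrent_versions: int) -> dict:
    """Build a lifecycle configuration with two Delete rules (hypothetical helper)."""
    return {
        "rule": [
            {
                # Rule 1: delete objects that are more than max_age_days old.
                "action": {"type": "Delete"},
                "condition": {"age": max_age_days},
            },
            {
                # Rule 2: delete a noncurrent version once this many newer
                # versions of the same object exist.
                "action": {"type": "Delete"},
                "condition": {
                    "isLive": False,
                    "numNewerVersions": max_noncurrent_versions,
                },
            },
        ]
    }


# Example values are assumptions, not recommendations.
config = build_lifecycle_config(max_age_days=365, max_noncurrent_versions=3)
print(json.dumps(config, indent=2))
```

The printed JSON can be saved to a file and applied to a bucket with whichever tool you use to manage bucket settings.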

Types of Cloud Storage

  • Standard : Frequently accessed data
  • Nearline : Infrequently accessed data, about once per month
  • Coldline : Data accessed about once every 90 days
  • Archive : Ideal for DR; data accessed less than once a year. Higher access cost.
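As a rough illustration of the list above, the following sketch maps an expected access interval to a storage class. The function name `suggest_storage_class` and the exact cut-offs (30, 90, and 365 days) are assumptions derived from the access frequencies in these notes, not an official selection rule.

```python
def suggest_storage_class(days_between_accesses: float) -> str:
    """Suggest a storage class from how often the data is read (hypothetical helper)."""
    if days_between_accesses < 30:
        return "STANDARD"   # frequently accessed data
    if days_between_accesses < 90:
        return "NEARLINE"   # roughly once per month
    if days_between_accesses < 365:
        return "COLDLINE"   # roughly once every 90 days
    return "ARCHIVE"        # less than once a year, higher access cost


for days in (1, 30, 90, 400):
    print(days, suggest_storage_class(days))
```

In practice the choice also depends on retrieval cost and minimum storage duration, which this sketch ignores.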

Autoclass

  • Saves cost by automatically moving infrequently accessed data to lower-cost storage classes
  • Storage Transfer Service : Handles large data transfers into Cloud Storage.
  • Transfer Appliance : Hardware appliance for transferring data.

Filestore

NFSv3-compatible file share.

Cloud SQL

Fully managed MySQL, PostgreSQL, or SQL Server.

  • Automatic replication
  • Data is encrypted
  • Includes a network firewall
  • Point-in-time recovery option
  • Cross-region replicas for DR

Cloud Datastore

NoSQL database with indexes and queries (comparable to AWS DynamoDB).

AlloyDB

A faster, PostgreSQL-compatible database managed by Google.

Cloud Spanner

SQL-compatible relational database.

  • Scales horizontally
  • Strongly consistent
  • Handles a high rate of I/O operations
  • High cost

BigQuery

Petabyte-scale data warehouse designed to ingest, store, analyze, and visualize data.

  • SQL compatible
  • Can read from Cloud Bigtable
  • Query editor displays the estimated amount of data a query will process, which helps in understanding its cost
  • Supports external data sources such as Bigtable and Cloud Storage, so data can be queried in place without loading it into BigQuery
  • Federated queries: Allow a query to be sent to Cloud SQL, Cloud Spanner, etc.
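To make the link between the estimated data volume and cost concrete, here is a minimal sketch that converts a byte estimate into an approximate on-demand charge. The function name `estimate_query_cost` is hypothetical, and the price per TiB is deliberately a parameter: the value used in the example is a placeholder assumption, so look up the current rate for your region and pricing model.

```python
TIB = 2**40  # bytes in one tebibyte


def estimate_query_cost(bytes_processed: int, price_per_tib: float) -> float:
    """Approximate cost of a query from its estimated bytes processed."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return (bytes_processed / TIB) * price_per_tib


# Placeholder price, assumed for illustration only.
ASSUMED_PRICE_PER_TIB = 5.0

estimated_bytes = 250 * 2**30  # e.g. the editor reports about 250 GiB
cost = estimate_query_cost(estimated_bytes, ASSUMED_PRICE_PER_TIB)
print(f"approximate cost: {cost:.2f}")
```

Selecting only the columns you need lowers the byte estimate, which is usually the simplest way to reduce this figure.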

BigQuery Roles

  • admin : Manage all resources, cancel jobs, manage all data in the project
  • connectionAdmin / connectionUser
  • dataEditor : Create, edit, and delete data tables
  • dataOwner : Same as dataEditor, and metadata as well
  • jobUser : Run jobs
  • user : Make use of data, run jobs, etc.

INFORMATION_SCHEMA views in BigQuery provide metadata about your BigQuery resources, including datasets, jobs, and tables.
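As an example of what such a metadata query can look like, the sketch below builds a SQL string that lists the tables in one dataset through the `INFORMATION_SCHEMA.TABLES` view. The helper `build_tables_query` and the project and dataset names are hypothetical; the code only constructs the query text and does not connect to BigQuery.

```python
import re

# Conservative character checks so the names can be embedded in SQL safely.
_PROJECT_RE = re.compile(r"^[A-Za-z0-9_-]+$")
_DATASET_RE = re.compile(r"^[A-Za-z0-9_]+$")


def build_tables_query(project: str, dataset: str) -> str:
    """Return SQL that lists table names and types for one dataset."""
    if not _PROJECT_RE.match(project):
        raise ValueError(f"unexpected characters in project {project!r}")
    if not _DATASET_RE.match(dataset):
        raise ValueError(f"unexpected characters in dataset {dataset!r}")
    return (
        "SELECT table_name, table_type "
        f"FROM `{project}`.{dataset}.INFORMATION_SCHEMA.TABLES "
        "ORDER BY table_name"
    )


# "my-project" and "sales" are made-up names for illustration.
print(build_tables_query("my-project", "sales"))
```

The resulting string can be pasted into the query editor or passed to whichever client you use to run queries.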

Bigtable

NoSQL wide-column database optimized for heavy reads and writes. Handles huge data volumes with low latency. Ideal for operational and machine learning applications. Data is stored as key:value pairs. Ideal when:

  • There is more than 1 TB of data
  • Data arrives at high throughput or changes rapidly
  • Data is a time series or has a semantic ordering
  • Large amounts of data are processed in real time
  • Machine learning algorithms run on the data
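Because Bigtable keeps rows sorted by row key, the way a key is built decides how well time series data reads back. The sketch below shows one common pattern, assumed here for illustration: prefix the key with an entity ID and append a reversed, zero-padded timestamp so the newest reading for a device sorts first. The names `make_row_key` and `MAX_TS_MS` and the `#` separator are hypothetical choices, not part of any Bigtable API.

```python
# Assumed upper bound for millisecond timestamps; keeps the reversed value
# positive and at a fixed width of 13 digits.
MAX_TS_MS = 10**13


def make_row_key(device_id: str, timestamp_ms: int) -> str:
    """Build a row key that sorts the newest reading of a device first."""
    if not 0 <= timestamp_ms < MAX_TS_MS:
        raise ValueError("timestamp_ms out of supported range")
    reversed_ts = MAX_TS_MS - timestamp_ms
    # Zero-padding makes lexicographic order match numeric order.
    return f"{device_id}#{reversed_ts:013d}"


keys = [make_row_key("sensor-1", ts) for ts in (1000, 2000, 3000)]
for key in sorted(keys):
    print(key)
```

Leading with the device ID rather than the timestamp also spreads writes from many devices across the key space instead of concentrating them on the most recent rows.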

Firestore

NoSQL database for mobile, web, and IoT services. Data is stored in documents, which are organized into collections.

  • Stored as key:value pairs. Data is not retrieved with SQL queries.
  • Indexed by default
  • Data is cached on the device and updated when the device is online.
  • Billing is based on storage, queries, and documents read
  • ACID transactions ensure consistency across the entire transaction
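The document and collection structure described above shows up directly in Firestore paths, which alternate between collection IDs and document IDs. The following sketch classifies a slash-separated path by counting its segments; the function name `classify_path` and the sample paths are illustrative assumptions, and nothing here calls the Firestore client.

```python
def classify_path(path: str) -> str:
    """Return 'collection' or 'document' for a slash-separated path."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError("path must contain at least one segment")
    # Paths alternate collection/document/collection/..., so an even
    # number of segments ends on a document.
    return "document" if len(segments) % 2 == 0 else "collection"


for p in ("users", "users/alice", "users/alice/orders", "users/alice/orders/42"):
    print(p, "->", classify_path(p))
```

A subcollection such as `users/alice/orders` hangs off a single document, which is how nested data is usually modeled without making one document very large.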

Memorystore (Redis)

  • Managed Redis
  • In-memory data service

Option | Best for | Capacity
Cloud Storage | Immutable blobs larger than 5 MB, stored in buckets | Petabytes; largest object 5 TB
Cloud SQL | Full SQL support, for web frameworks or existing applications | 64 TB
Cloud Spanner | SQL compatible, horizontal scaling | Petabytes
Firestore | Massive scaling with real-time queries; offline query support | Petabytes; max document size 1 MB
Cloud Bigtable | Large amounts of analytical data; NoSQL | Petabytes

Dataproc

  • Managed Hadoop and Spark
  • Open-source data science tooling

Dataflow

  • Streaming analytics
  • Real-time AI/ML

Pub/Sub

  • Must create a topic and a subscription before messages can be published and received
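To show why both a topic and a subscription are needed, here is a toy in-memory model of the relationship. It is not the Pub/Sub client library: the class `ToyBroker` and its methods are hypothetical and exist only to illustrate that publishers write to a topic, each subscription receives its own copy of every message, and a subscription only sees messages published after it was created.

```python
from collections import deque


class ToyBroker:
    """In-memory stand-in for the topic/subscription model (illustration only)."""

    def __init__(self) -> None:
        self._topics = {}   # topic name -> list of subscription names
        self._queues = {}   # subscription name -> pending messages

    def create_topic(self, topic: str) -> None:
        self._topics.setdefault(topic, [])

    def create_subscription(self, name: str, topic: str) -> None:
        if topic not in self._topics:
            raise KeyError(f"topic {topic!r} does not exist")
        self._topics[topic].append(name)
        self._queues[name] = deque()

    def publish(self, topic: str, message: str) -> None:
        if topic not in self._topics:
            raise KeyError(f"topic {topic!r} does not exist")
        # Fan out: every subscription on the topic gets its own copy.
        for name in self._topics[topic]:
            self._queues[name].append(message)

    def pull(self, name: str) -> list:
        queue = self._queues[name]
        messages = list(queue)
        queue.clear()
        return messages


broker = ToyBroker()
broker.create_topic("orders")
broker.publish("orders", "early")          # no subscription yet, so not retained
broker.create_subscription("billing", "orders")
broker.create_subscription("shipping", "orders")
broker.publish("orders", "order-1")
print(broker.pull("billing"))
print(broker.pull("shipping"))
```

The real service adds acknowledgements, retries, and retention on top of this basic shape.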