How to Pretty Print JSON on the Command Line

JSON is a very popular, platform-independent data format. One of the great benefits of working with JSON is that it is generally easy to read. However, JSON objects become harder to read as they grow, especially on the command line. Pretty printing JSON records on the command line makes reading them much
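
As a minimal sketch of the idea, the snippet below pipes a compact record through Python's built-in json.tool module. The pretty_json helper name and the sample record are illustrative assumptions, not taken from the original post, and the post itself may rely on a different tool.

```shell
# Hypothetical helper: read JSON on stdin, write an indented copy to stdout.
pretty_json() {
  python3 -m json.tool
}

# A made-up compact record, used only for illustration.
compact='{"name": "example", "tags": ["a", "b"]}'

# Pretty print the record; each key and array element lands on its own line.
pretty=$(printf '%s\n' "$compact" | pretty_json)
printf '%s\n' "$pretty"
```

The same helper should work on a file as well, for instance pretty_json < records.json, where records.json is a placeholder file name.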

How to Stop and Delete all Docker Containers

People who work with Docker know how easy it is to create a large number of containers. Occasionally it becomes necessary to delete containers that are no longer needed. All existing Docker containers can be stopped and deleted by using docker stop and docker rm together with docker ps (using command substitution).
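
One way this could look is sketched below. The function name stop_and_remove_all_containers is hypothetical; docker ps -a -q lists the IDs of all containers (running or stopped), and command substitution hands those IDs to docker stop and docker rm.

```shell
# Hypothetical helper: stop, then delete, every container on the host.
stop_and_remove_all_containers() {
  # Command substitution captures the IDs of all containers, one per line.
  ids=$(docker ps -a -q)

  # Without this guard, docker stop would be called with no arguments.
  if [ -z "$ids" ]; then
    echo "no containers to remove"
    return 0
  fi

  # $ids is left unquoted on purpose so each ID becomes its own argument.
  docker stop $ids
  docker rm $ids
}
```

Calling stop_and_remove_all_containers on a host with no containers prints a short message instead of invoking docker stop with an empty argument list. The same idea is often written inline as docker stop $(docker ps -a -q) followed by docker rm $(docker ps -a -q).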

Count Number of Occurrences of Characters in Line with AWK

Being able to count the number of occurrences of characters or words in text is a handy trick. Fortunately, this is very easy to do in awk with the gsub() function. The syntax for using gsub() looks like this:

    gsub(regexp, replacement [, target])

gsub() will search target for substrings matching the provided regular expression and
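
Because gsub() returns the number of substitutions it made, it can double as a counter. The sketch below replaces each match with itself ("&"), so the line is left unchanged while the return value gives the count. The helper name count_char_per_line and the sample input are assumptions for illustration; the character is passed in as a regular expression, so metacharacters such as . would need escaping.

```shell
# Hypothetical helper: print how many times the pattern in $1 occurs on each input line.
count_char_per_line() {
  # With no target given, gsub() works on the whole line ($0).
  # Replacing each match with "&" (the matched text) leaves the line unchanged.
  awk -v ch="$1" '{ print gsub(ch, "&") }'
}

# Made-up sample input: count the letter "a" on each line.
counts=$(printf 'banana\ncherry\n' | count_char_per_line a)
printf '%s\n' "$counts"
# prints 3 for "banana" and 0 for "cherry"
```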

Writing to a Database from Spark

One of the great features of Spark is the variety of data sources it can read from and write to. If you already have a database to write to, connecting to that database and writing data from Spark is fairly simple. This example shows how to write to a database that supports JDBC connections.

Databases Supporting

Loading Data from a Database into Spark

One of the great features of Spark is the variety of data sources it can read from. Loading data from a database into Spark using JDBC requires three major steps. First, you need a running database that supports JDBC connections. Next, you will need to download and use the JDBC driver for that database. Finally

How to Load Text Files into MySQL

MySQL databases are often populated by loading text files directly into tables. MySQL makes this very easy to do with the LOAD DATA INFILE statement.

LOAD DATA INFILE Statements

LOAD DATA INFILE statements can read data into MySQL tables at very high speeds. This will be much faster than running many single

Execute MySQL Scripts on the Command Line

When working with MySQL, it is often necessary to execute SQL statements or scripts programmatically from the command line. SQL statements can be included in a SQL script (text) file and executed by the MySQL client. This can be done in a few different ways.

Execute MySQL Script File from Standard Input

Statements in a

Connecting Docker Containers with Networking

Docker containers are isolated environments, often running a single service. Usually, individual containers need to be connected to others to accomplish more complex tasks. Starting in Docker 1.9, this has become very easy to do with Networking.

Docker Networking

Docker Networking is a feature that allows you to create a virtual network and attach containers

Select Random Lines from a File in Linux

Being able to select random lines from a file in Linux can be very helpful and convenient. There are a few easy ways to do this, including using the shuf utility. shuf is included on many Linux/Unix systems as part of GNU coreutils. GNU coreutils can be easily installed if it is not already present.

Selecting Random Lines
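
A minimal sketch of selecting random lines with shuf follows, assuming GNU coreutils is installed. The helper name random_lines and the generated sample file are illustrative assumptions; the -n option limits the output to the requested number of lines.

```shell
# Hypothetical helper: print $1 randomly chosen lines from the file named in $2.
random_lines() {
  shuf -n "$1" "$2"
}

# Build a small throwaway file with ten distinct lines.
sample_file=$(mktemp)
for i in 1 2 3 4 5 6 7 8 9 10; do
  printf 'line-%s\n' "$i"
done > "$sample_file"

# Pick three lines at random; the selection differs from run to run.
picked=$(random_lines 3 "$sample_file")
printf '%s\n' "$picked"

# Remove the throwaway file.
rm -f "$sample_file"
```

shuf samples without repeating input lines by default, so the three lines picked here are always distinct.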

Set AWS Credentials in Cloudera Quickstart Docker Container

Cloudera’s Quickstart image is a fantastic way to get started quickly with the big data ecosystem. With software such as Hadoop, Spark, Hive, Pig, Impala, and Hue already set up, this Docker image is a must in your big data toolkit. One thing the Cloudera Quickstart container lacks, however, is an easy way to
