From the blog

How-Tos, Deep-Dives, Brain-Dumps, and More

GITHUB, ACTIONS, PHP, BLOG

Fix the style of your code using Github Actions

In this article, we will leverage Github Actions to fix our PHP, JS, and CSS code style.

For most of my projects, I like to use styleci.io. Recently I was working on a project where I cannot use StyleCI because of compliance. Since our project was hosted on Github, we created Github workflows to replicate the functionality. Our project had two separate repositories, one for back-end API and one for our SPA frontend. So, we will be using two different Github workflows to style our codebase. But you can always combine it into one workflow depending on your requirement.

I have created a Demo Github repository that contains a working example of this concept. Let's discuss those two workflows.

Style PHP

This workflow fixes your PHP code to follow standards. We are using PHP-CS-Fixer for the same.

We will need to do three things.

Step 1

Let's start by adding the FriendsOfPHP/PHP-CS-Fixer composer package to our codebase.

composer require friendsofphp/php-cs-fixer --dev

Step 2

Create a Github workflow inside the .github/workflows directory. I named my workflow as style-php-cs-fixer.yml.

name: Style PHP-CS-Fixer

on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]

jobs:
  style:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          ref: ${{ github.head_ref }}

      - name: Install composer packages
        run: composer install

      - name: Run PHP-CS-Fixer Style
        run: ./vendor/bin/php-cs-fixer fix

      - uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: Apply PHP-CS-Fixer changes

Step 3

Create a .php_cs file with the Rule Sets. You can view the list of available rules here.

I used the rule set that I use with most of my PHP/Laravel projects.

<?php

use PhpCsFixer\Config;
use PhpCsFixer\Finder;

$rules = [
    'array_syntax' => ['syntax' => 'short'],
    'binary_operator_spaces' => [
        'default' => 'single_space',
        'operators' => ['=>' => null]
    ],
    'blank_line_after_namespace' => true,
    'blank_line_after_opening_tag' => true,
    'blank_line_before_statement' => [
        'statements' => ['return']
    ],
    'braces' => true,
    'cast_spaces' => true,
    'class_attributes_separation' => [
        'elements' => ['method']
    ],
    'class_definition' => true,
    'concat_space' => [
        'spacing' => 'none'
    ],
    'declare_equal_normalize' => true,
    'elseif' => true,
    'encoding' => true,
    'full_opening_tag' => true,
    'fully_qualified_strict_types' => true, // added by Shift
    'function_declaration' => true,
    'function_typehint_space' => true,
    'heredoc_to_nowdoc' => true,
    'include' => true,
    'increment_style' => ['style' => 'post'],
    'indentation_type' => true,
    'linebreak_after_opening_tag' => true,
    'line_ending' => true,
    'lowercase_cast' => true,
    'lowercase_constants' => true,
    'lowercase_keywords' => true,
    'lowercase_static_reference' => true, // added from Symfony
    'magic_method_casing' => true, // added from Symfony
    'magic_constant_casing' => true,
    'method_argument_space' => true,
    'native_function_casing' => true,
    'no_alias_functions' => true,
    'no_extra_blank_lines' => [
        'tokens' => [
            'extra',
            'throw',
            'use',
            'use_trait',
        ]
    ],
    'no_blank_lines_after_class_opening' => true,
    'no_blank_lines_after_phpdoc' => true,
    'no_closing_tag' => true,
    'no_empty_phpdoc' => true,
    'no_empty_statement' => true,
    'no_leading_import_slash' => true,
    'no_leading_namespace_whitespace' => true,
    'no_mixed_echo_print' => [
        'use' => 'echo'
    ],
    'no_multiline_whitespace_around_double_arrow' => true,
    'multiline_whitespace_before_semicolons' => [
        'strategy' => 'no_multi_line'
    ],
    'no_short_bool_cast' => true,
    'no_singleline_whitespace_before_semicolons' => true,
    'no_spaces_after_function_name' => true,
    'no_spaces_around_offset' => true,
    'no_spaces_inside_parenthesis' => true,
    'no_trailing_comma_in_list_call' => true,
    'no_trailing_comma_in_singleline_array' => true,
    'no_trailing_whitespace' => true,
    'no_trailing_whitespace_in_comment' => true,
    'no_unneeded_control_parentheses' => true,
    'no_unreachable_default_argument_value' => true,
    'no_useless_return' => true,
    'no_whitespace_before_comma_in_array' => true,
    'no_whitespace_in_blank_line' => true,
    'normalize_index_brace' => true,
    'not_operator_with_successor_space' => true,
    'object_operator_without_whitespace' => true,
    'ordered_imports' => ['sortAlgorithm' => 'alpha'],
    'phpdoc_indent' => true,
    'phpdoc_inline_tag' => true,
    'phpdoc_no_access' => true,
    'phpdoc_no_package' => true,
    'phpdoc_no_useless_inheritdoc' => true,
    'phpdoc_scalar' => true,
    'phpdoc_single_line_var_spacing' => true,
    'phpdoc_summary' => true,
    'phpdoc_to_comment' => true,
    'phpdoc_trim' => true,
    'phpdoc_types' => true,
    'phpdoc_var_without_name' => true,
    'psr4' => true,
    'self_accessor' => true,
    'short_scalar_cast' => true,
    'simplified_null_return' => false, // disabled by Shift
    'single_blank_line_at_eof' => true,
    'single_blank_line_before_namespace' => true,
    'single_class_element_per_statement' => true,
    'single_import_per_statement' => true,
    'single_line_after_imports' => true,
    'single_line_comment_style' => [
        'comment_types' => ['hash']
    ],
    'single_quote' => true,
    'space_after_semicolon' => true,
    'standardize_not_equals' => true,
    'switch_case_semicolon_to_colon' => true,
    'switch_case_space' => true,
    'ternary_operator_spaces' => true,
    'trailing_comma_in_multiline_array' => true,
    'trim_array_spaces' => true,
    'unary_operator_spaces' => true,
    'visibility_required' => [
        'elements' => ['method', 'property']
    ],
    'whitespace_after_comma_in_array' => true,
];

$project_path = getcwd();
$finder = Finder::create()
    ->in([
        $project_path
    ])
    ->name('*.php')
    ->notName('*.blade.php')
    ->ignoreDotFiles(true)
    ->ignoreVCS(true);

return Config::create()
    ->setFinder($finder)
    ->setRules($rules)
    ->setRiskyAllowed(true)
    ->setUsingCache(true);

You will need to modify the path on line number 129 to match your codebase. Here is an example rule sets that laravel shit uses.

Now our PHP Style is all set.

Style JS

This workflow fixes your JS, CSS, and Vue code to follow standards. We are using Prettier for the same.

We will need to do three things. Install the Prettier package. Configure our Github workflow. Configure our Prettier configuration file

Step 1

Let's start by adding the Prettier package to our codebase.

yarn add --dev --exact prettier

Step 2

Create a Github workflow inside the .github/workflows directory. I named my workflow as style-prettier.yml.

name: Style Prettier

on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]

jobs:
  style:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          ref: ${{ github.head_ref }}

      - name: Install node packages
        run: yarn

      - name: Run fixer
        run: yarn prettier --write *.{css,js,vue}

      - uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: Apply prettier changes

Step 3

Create a .prettierrc configuration file.

{
    "printWidth": 100,
    "singleQuote": true,
    "tabWidth": 4,
    "trailingComma": "es5"
}

Result

To verify that our workflows perform the desired actions. I have created a pull request will minor changes to the PHP, JS, CSS, and VUE files.

As you can see, our workflow corrected our code style and added two new commits on our pull request.

You can review what files were changed in each commit.

  • dd2333e - Applied Prettier changes.
  • f34129e - Applied PHP-CS-Fixer changes.

4 min read

DATABASE, PHP, LARAVEL, BLOG

Laravel - Creating a date dimension (calendar) table

What is a date dimension table

Date Dimension is a table that has one record per each day, no more, no less! Date dimension plays a vital role in your data warehouse designing, it provides the ability to study behavior and trend of your data over a period of time.

Most BI tools have an in-built date dimension table, but sometimes you might want to create one. Date dimension table is only created and loaded once since the data is not changed that often.

This article is inspired by the article "Creating a date dimension or calendar table in SQL Server". The article is specific for MS SQL database, but in our example, we'll create one using Laravel and Carbon so it can work with any kind of database driver.

We will be creating the following files with Laravel.

  • Migration - To create the table.
  • Model - To reference the table.
  • Console Command - To populate the table.

Step 1 - Create the migration

The columns in this table are self-explanatory, you can also google them if you want to learn more.

<?php

use Illuminate\Support\Facades\Schema;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Database\Migrations\Migration;

class CreateDateDimensionsTable extends Migration
{
    /**
     * Run the migrations.
     *
     * @return void
     */
    public function up()
    {
        Schema::create('date_dimensions', function (Blueprint $table) {
            $table->date('date')->primary();
            $table->unsignedInteger('day');
            $table->unsignedInteger('month');
            $table->unsignedInteger('year');
            $table->string('day_name');
            $table->string('day_suffix', 2);
            $table->unsignedInteger('day_of_week');
            $table->unsignedInteger('day_of_year');
            $table->unsignedInteger('is_weekend');
            $table->unsignedInteger('week');
            $table->unsignedInteger('iso_week');
            $table->unsignedInteger('week_of_month');
            $table->unsignedInteger('week_of_year');
            $table->unsignedInteger('iso_week_in_year');
            $table->string('month_name');
            $table->string('month_year');
            $table->string('month_name_year');
            $table->date('first_day_of_month');
            $table->date('last_day_of_month');
            $table->date('first_day_of_next_month');
            $table->unsignedInteger('quarter');
            $table->string('quarter_name');
            $table->date('first_day_of_quarter');
            $table->date('last_day_of_quarter');
            $table->date('first_day_of_year');
            $table->date('last_day_of_year');
            $table->date('first_day_of_next_year');
            $table->unsignedInteger('dow_in_month');
            $table->timestamps();
        });
    }
    
    /**
     * Reverse the migrations.
     *
     * @return void
     */
    public function down()
    {
        Schema::dropIfExists('date_dimensions');
    }
}

Step 2 - Create the model

In the model we are also typecasting the attributes, so it sets/gets the value in the correct format.

<?php

namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Illuminate\Database\Eloquent\SoftDeletes;

class DateDimension extends Model
{
    /**
     * Indicates if the IDs are auto-incrementing.
     *
     * @var bool
     */
    public $incrementing = false;
  
    /**
     * The "type" of the primary key ID.
     *
     * @var string
     */
    protected $keyType = 'date';
  
    /**
     * The primary key associated with the table.
     *
     * @var string
     */
    protected $primaryKey = 'date';
    
    /**
     * The attributes that are mass assignable.
     *
     * @var array
     */
    protected $fillable = [
        'date',
        'day',
        'month',
        'year',
        'day_name',
        'day_suffix',
        'day_of_week',
        'day_of_year',
        'is_weekend',
        'week',
        'iso_week',
        'week_of_month',
        'week_of_year',
        'iso_week_in_year',
        'month_name',
        'month_year',
        'month_name_year',
        'first_day_of_month',
        'last_day_of_month',
        'first_day_of_next_month',
        'quarter',
        'quarter_name',
        'first_day_of_quarter',
        'last_day_of_quarter',
        'first_day_of_year',
        'last_day_of_year',
        'first_day_of_next_year',
        'dow_in_month',
    ];
 
    /**
     * The attributes that should be cast to native types.
     *
     * @var array
     */
    protected $casts = [
        'date' => 'date:Y-m-d',
        'first_day_of_month' => 'date:Y-m-d',
        'last_day_of_month' => 'date:Y-m-d',
        'first_day_of_next_month' => 'date:Y-m-d',
        'first_day_of_quarter' => 'date:Y-m-d',
        'last_day_of_quarter' => 'date:Y-m-d',
        'first_day_of_year' => 'date:Y-m-d',
        'last_day_of_year' => 'date:Y-m-d',
        'first_day_of_next_year' => 'date:Y-m-d',
    ];
}

Step 3 - Create a command

<?php

namespace App\Console\Commands;

use App\Models\DateDimension;
use Carbon\Carbon;
use Carbon\CarbonPeriod;
use Illuminate\Console\Command;
use function ceil;
use function now;

class PopulateDateDimensionsTableCommand extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'app:PopulateDateDimensionsTableCommand';
    
    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'Populate data for date dimensions table';
    
    /**
     * Create a new command instance.
     *
     * @return void
     */
    public function __construct()
    {
        parent::__construct();
    }
    
    /**
     * Execute the console command.
     *
     * @return mixed
     */
    public function handle()
    {
        $this->info(now()->toDateTimeString() . " Start: app:PopulateDateDimensionsTableCommand");
        
        // Truncate all records
        DateDimension::truncate();
        
        // Create an empty array and save the transformed input to array
        $dataToInsert = [];
        
        // Get the date range
        // @NOTE - update the start and end date as per your choice
        $dates = CarbonPeriod::create('2015-01-01', '2030-12-31');
        
        // For each dates create a transformed data
        foreach ($dates as $date) {
            
            // Get the quarter details, as ABC has a different quarter system
            // @note - Carbon does not allow to override the quarters
            $quarterDetails = $this->getQuarterDetails($date);
        
            // Main transformer
            $dataToInsert[] = [
                'date' => $date->format('Y-m-d'),
                'day' => $date->day,
                'month' => $date->month,
                'year' => $date->year,
                'day_name' => $date->dayName,
                'day_suffix' => $this->getDaySuffix($date->day),
                'day_of_week' => $date->dayOfWeek,
                'day_of_year' => $date->dayOfYear,
                'is_weekend' => (int) $date->isWeekend(),
                'week' => $date->week,
                'iso_week' => $date->isoWeek,
                'week_of_month' => $date->weekOfMonth,
                'week_of_year' => $date->weekOfYear,
                'iso_week_in_year' => $date->isoWeeksInYear,
                'month_name' => $date->monthName,
                'month_year' => $date->format('mY'),
                'month_name_year' => $date->format('MY'),
                'first_day_of_month' => $date->clone()->firstOfMonth()->format('Y-m-d'),
                'last_day_of_month' => $date->clone()->lastOfMonth()->format('Y-m-d'),
                'first_day_of_next_month' => $date->clone()->addMonthNoOverflow()->firstOfMonth()->format('Y-m-d'),
                'quarter' => $quarterDetails['value'],
                'quarter_name' => $quarterDetails['name'],
                'first_day_of_quarter' => $quarterDetails['first_day_of_quarter'],
                'last_day_of_quarter' => $quarterDetails['last_day_of_quarter'],
                'first_day_of_year' => $date->clone()->firstOfYear()->format('Y-m-d'),
                'last_day_of_year' => $date->clone()->lastOfYear()->format('Y-m-d'),
                'first_day_of_next_year' => $date->clone()->addYear()->firstOfYear()->format('Y-m-d'),
                'dow_in_month' => (int)ceil($date->day/7),
                'created_at' => now(),
                'updated_at' => now(),
            ];
        }
        
        // Create chunks for faster insertion
        // @note - SQL Server supports a maximum of 2100 parameters.
        $chunks = collect($dataToInsert)->chunk(50);
        
        // Using chunks insert the data
        foreach ($chunks as $chunk) {
            DateDimension::insert($chunk->toArray());   
        }
        
        $this->info(now()->toDateTimeString() . " Complete: app:PopulateDateDimensionsTableCommand");
    }
    
    /**
     * Get Quarter details
     * @OTE - Depending on your companies quarter update the map and logic below 
     *
     * @param Carbon $date
     * @return array
     */
    private function getQuarterDetails(Carbon $date)
    {
        $quarterMonthMap = [
            1 => ['value' => 1, 'name' => 'First'],
            2 => ['value' => 2, 'name' => 'Second'],
            3 => ['value' => 2, 'name' => 'Second'],
            4 => ['value' => 2, 'name' => 'Second'],
            5 => ['value' => 3, 'name' => 'Third'],
            6 => ['value' => 3, 'name' => 'Third'],
            7 => ['value' => 3, 'name' => 'Third'],
            8 => ['value' => 4, 'name' => 'Fourth'],
            9 => ['value' => 4, 'name' => 'Fourth'],
            10 => ['value' => 4, 'name' => 'Fourth'],
            11 => ['value' => 1, 'name' => 'First'],
            12 => ['value' => 1, 'name' => 'First'],
        ];
        
        $output['value'] = $quarterMonthMap[$date->month]['value'];
        $output['name'] = $quarterMonthMap[$date->month]['name'];
        
        switch ($output['value']) {
            case 1:
                $output['first_day_of_quarter'] = Carbon::parse($date->year - 1 . '-11-01')->firstOfMonth()->format('Y-m-d');
                $output['last_day_of_quarter'] = Carbon::parse($date->year . '-01-01')->lastOfMonth()->format('Y-m-d');
                
                break;
            case 2:
                $output['first_day_of_quarter'] = Carbon::parse($date->year . '-02-01')->firstOfMonth()->format('Y-m-d');
                $output['last_day_of_quarter'] = Carbon::parse($date->year . '-04-01')->lastOfMonth()->format('Y-m-d');
                
                break;
            case 3:
                $output['first_day_of_quarter'] = Carbon::parse($date->year . '-05-01')->firstOfMonth()->format('Y-m-d');
                $output['last_day_of_quarter'] = Carbon::parse($date->year . '-07-01')->lastOfMonth()->format('Y-m-d');
                
                break;
            case 4:
                $output['first_day_of_quarter'] = Carbon::parse($date->year . '-08-01')->firstOfMonth()->format('Y-m-d');
                $output['last_day_of_quarter'] = Carbon::parse($date->year . '-10-01')->lastOfMonth()->format('Y-m-d');
                
                break;
        }
        
        return $output;
    }
    
    /**
     * Get the Day Suffix
     * Copied logic from - https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server/
     *
     * @param $day
     * @return string
     */
    private function getDaySuffix($day)
    {
        if ($day/10 == 1) {
            return "th";
        }
        $right = substr($day, -1);
        
        if ($right == 1) {
            return 'st';
        }
        
        if ($right == 2) {
            return 'nd';
        }
        
        if ($right == 3) {
            return 'rd';
        }
        
        return 'th';
    }
}

In this command, update the start date and end date (line 55) as per your requirement. Also, the quarter in this script run from November-January, February- April, May-July, August-October, this might be different for your use case, so it should be updated accordingly.

Conclusion

In this article you were able to design a date dimension table using Laravel. The migration, commands, etc. are database driver independent. So you can essentially create this table in MS SQL or MySQL or any other database supported by Laravel.

References

6 min read

TESTING, BLOG

How to add PSR-2 code check as unit test

Who is this article for?

This article is for developers who are looking to add PSR-2 code check to their project.

Before You Begin

I am going to assume you know what PSR-2 coding style entails. If you are not fully aware of what PSR-2 is, please take time and read about it.

Below are some good reads -

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
Refactoring: Improving the Design of Existing Code, 1999. Martin Fowler.

Now that you have a basic understanding of what PSR-2 is, let's jump into implementing it. We are going to implement it as a Unit test called CodeStyleLinterTest.

Step 1 - Install Required Packages

PHP-CS-Fixer

This tool will verify your coding standards as defined in the PSR-1, PSR-2, etc., or other community-driven ones like the Symfony one. You can also define your (team's) style through configuration. You can read all about the package here - https://github.com/FriendsOfPHP/PHP-CS-Fixer. It even allows you to create custom rule-based on your requirements. Ok, so let's install it now to our PHP project

composer require friendsofphp/php-cs-fixer --dev

The Process Component

The Process component executes commands in sub-processes. You can also use exec() but for security reason, I would recommend using the process component package.

composer require symfony/process  --dev

As I am going to use these two packages only for my development purposes only, I have added them as dev dependencies of my composer.json.

Step 2 - The Unit Test Case

Script

<?php

namespace Tests\Unit;

use Symfony\Component\Process\Process;
use function base_path;
use Tests\TestCase;

class CodeStyleLinterTest extends TestCase
{
    // Add more path to test
    const PATH_TO_TEST = [
        'app',
        'config',
        'public/index.php',
        'tests',
        'database'
    ];
    
    /** @test */
    public function psr2()
    {
        $phpCsFixerPath = base_path('vendor/bin/php-cs-fixer');
        // Let's check PSR-2 compliance for our code
        foreach (self::PATH_TO_TEST as $path) {
            $fullPath = base_path($path);
            // Run linter in dry-run mode so it changes nothing.
            $process = new Process([$phpCsFixerPath, 'fix', $fullPath, '--dry-run', '--allow-risky=no']);
            $process->run();
            // Exit code should be 0, else there is a problem with the PSR-2 compliance
            $this->assertEquals(
                0,
                $process->getExitCode(),
                $process->getOutput()
            );
        }
    }
}

I created this unit test for my laravel app, but you can use it for any PHP project. You can update the PATH_TO_TEST constant to include your directories or files you want to run the PSR-2 code check.

Conclusion

Having a simple PSR-2 does not ensure code quality; it only makes the code look pretty i.e., formatting. You should conduct regular code review session to go over your code.

Sources

2 min read

MYSQL, ACCESS, DATABASE, BLOG

How to effectively import large MS Access DB to MySQL DB

Who is this article for?

Anyone looking for how to import large MS Access DB to MySQL DB effectively.

Before You Begin

There are several options which we can use to import the data to MySQL. Few of those are listed in these articles below -

https://stackoverflow.com/questions/5722544/how-can-i-convert-an-mdb-access-file-to-mysql-or-plain-sql-file

https://stackoverflow.com/questions/4809654/how-to-import-an-access-mdb-format-database-to-mysql

https://stackoverflow.com/questions/5465661/converting-mysql-to-ms-access

These options work well if the data that needs to be imported is small but is very slow for an extensive database. The reason the import is very slow is that it creates one insert statement per record. I recently had to convert a relatively large MS Access DB file to MySQL. The import process was prolonged, and I let it run overnight, but it didn't complete. So I had to come up with a different solution which can import the data much faster.

For this post, we can use the data provided by IPEDS. The file can be downloaded from here - https://nces.ed.gov/ipeds/use-the-data/download-access-database. We will use the 2017-18 Access file, which is the latest release at the time of writing this article.

Also, we are going to use mdbtools. I believe you can install it on Windows Operating system too - https://github.com/brianb/mdbtools/issues/107. Since I run a Linux distribution installing it was an easy setup.

Script

#!/bin/bash
#===========================================================
# FILE        : script.sh
# DESCRIPTION : Convert MS Access to MySQL.
#===========================================================
set -o nounset # help avoid bugs
shopt -s extglob
PATH=/bin:/usr/bin:/sbin # for consistency

# variables declaration
DB_NAME=ENTER_YOUR_DB_NAME
DB_UN=ENTER_YOUR_DB_USER
PATH_TO_ACCESS_FILE=ENTER_THE_ABSOLUTE_PATH_TO_ACCESS_FILE

# set login path to suppress warnings
mysql_config_editor remove --login-path=local
mysql_config_editor set --login-path=local --host=localhost --user=$DB_UN --password

# get all the tables
TABLES=$(mdb-tables -1 $PATH_TO_ACCESS_FILE)

# drop tables if exits
echo "$(date +%Y%m%d_%H%M) DROPPING TABLE IF EXISTS"
for t in $TABLES
do
    mysql --login-path=local $DB_NAME -Bse "DROP TABLE IF EXISTS $t;"
done

# create meta definition for the tables
rm -rf meta.sql
mdb-schema $PATH_TO_ACCESS_FILE mysql > meta.sql

# create the tables using the meta.sql file generated above
echo "$(date +%Y%m%d_%H%M) CREATING TABLES;"
mysql --login-path=local $DB_NAME < meta.sql
rm -rf meta.sql

echo "$(date +%Y%m%d_%H%M) IMPORTING DATA INTO TABLES;"
for t in $TABLES
do
    echo "$(date +%Y%m%d_%H%M) IMPORTING DATA FOR $t;"
    rm -rf $t.csv
    mdb-export -D '%Y-%m-%d %H:%M:%S' $PATH_TO_ACCESS_FILE $t > $t.csv
    mysqlimport --login-path=local --ignore-lines=1 --fields-terminated-by=, --fields-optionally-enclosed-by='"' --local $DB_NAME $t.csv
    mysql --login-path=local $DB_NAME -Bse "ALTER TABLE $t CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;"
    rm -rf $t.csv
done

echo "$(date +%Y%m%d_%H%M) COMPLETED;"

Conclusion

When I used this script, I was able to import the data in less than a few minutes. A general outline of how the script works.

  1. Creates login path to suppress warnings.
  2. Drops tables if they exist from the specified database.
  3. Recreate the table metadata from MS Access.
  4. Export each table to .csv file.
  5. Bulk import the .csv data to the specified table.

Limitations

  • The above script is tested on a Linux distribution.
  • mdbtools does not create indexes or foreign key references.

Sources

3 min read

JSON, PERFORMANCE, PHP, BLOG

PHP - Performance - Adjacency set to nested JSON

Before You Begin

In this article, we will compare the performance of Method 1 and Method 2 as used in - PHP – Convert Adjacency set to nested JSON.

We will be utilizing an extensive data set with approximately 500K records. I have downloaded these records from itis.gov and made some minor modification for this article. Data can be downloaded from here - https://drive.google.com/open?id=115_d5PV4OQ948cWP5TLXUpN2TvLvvo_y

We will be using MySQL CTE and break the data into chunks to test the performance for different size of records. You can refer to the CTE article here - MySQL – Adjacency List Model For Hierarchical Data Using CTE.

There is a slight change made to getDataFromDatabase() method as we want to limit the records to test the performance. Below you can find the updated definition. I have added the placeholder inside the CTE as [ID_OF_ENTITY]. This will be our starting point and then I will retrieve all descendants for this entity and convert it to a nested response.

Method 1

<?php

// Initialize php setting - @node - NOT RECOMMENDED FOR PRODUCTION, update your php.ini if needed.
error_reporting(E_ALL);
ini_set('display_errors', 1);
ini_set('memory_limit', '-1');
ini_set('max_execution_time', 30000);

// Get the tree array from DB
$treeArray = getDataFromDatabase();

// Transform the data
$outputTree = transformTree($treeArray, [ID_OF_ENTITY]);

// Output the response
header('Content-Type: application/json');
echo json_encode($outputTree[0]);

/**
 * Transform the tree
 *
 * @param $treeArray
 * @param null $parentId
 * @return array
 */
function transformTree($treeArray, $parentId = null)
{
    $output = [];

    // Read through all nodes of the tree
    foreach ($treeArray as $node) {
        
    // If the node parent is same as parent passed in argument
        if ($node['parent_id'] == $parentId) {
            
            // Get all the children for that node, using recursive method
            $children = transformTree($treeArray, $node['id']);
            
            // If children are found, add it to the node children array
            if ($children) {
                $node['children'] = $children;
            }
            
            // Add the main node with/without children to the main output
            $output[] = $node;
            
            // Remove the node from main array to avoid duplicate reading, speed up the process
            unset($node);
        }
    }
    return $output;
}

/**
 * Get all records from DB and return array
 *
 * @return array
 */
function getDataFromDatabase()
{
    // Create database connection
    $dbConnection = new mysqli("localhost", "root", "secret", "adjacency");
    
    // Get the result from DB Table
    $records = $dbConnection->query("
        WITH RECURSIVE tree AS
        (
          SELECT id, name, parent_id
            FROM entities
            WHERE id = [ID_OF_ENTITY]
          UNION ALL
          SELECT e.id, e.name, e.parent_id
            FROM tree AS t
              JOIN entities AS e ON t.id = e.parent_id
        )
        SELECT id, name, parent_id FROM tree;
    ");
    
    // Fetch all records
    // @MYSQLI_ASSOC - Columns are returned into the array having the field name as the array index.
    $output = mysqli_fetch_all($records, MYSQLI_ASSOC);
    
    // Close the connection
    $dbConnection->close();
    
    return $output;
}

Method 2

<?php

// Initialize php setting - @node - NOT RECOMMENDED FOR PRODUCTION, update your php.ini if needed.
error_reporting(E_ALL);
ini_set('display_errors', 1);
ini_set('memory_limit', '-1');
ini_set('max_execution_time', 30000);

// Get the tree array from DB
$treeArray = getDataFromDatabase();

// Group by parent id
$treeArrayGroups = [];

foreach ($treeArray as $record) {
    $treeArrayGroups[$record['parent_id']][] = $record;
}

// Get the root
$rootArray = $treeArray[0];

// Transform the data
$outputTree = transformTree($treeArrayGroups, $rootArray);

// Output the response
header('Content-Type: application/json');
echo json_encode($outputTree);

/**
 * Transform the tree
 *
 * @param $treeArrayGroups
 * @param $rootArray
 * @return mixed
 */
function transformTree($treeArrayGroups, $rootArray)
{
    // Read through all nodes where parent is root array
    foreach ($treeArrayGroups[$rootArray['id']] as $child) {
        
        // If there is a group for that child, aka the child has children
        if (isset($treeArrayGroups[$child['id']])) {
            // Traverse into the child
            $newChild = transformTree($treeArrayGroups, $child);
        } else {
            $newChild = $child;
        }
        
        // Assign the child to the array of children in the root node
        $rootArray['children'][] = $newChild;
    }
    return $rootArray;
}

/**
 * Get all records from DB and return array
 *
 * @return array
 */
function getDataFromDatabase()
{
    // Create database connection
    $dbConnection = new mysqli("localhost", "root", "secret", "adjacency");
    
    // Get the result from DB Table
    $records = $dbConnection->query("
        WITH RECURSIVE tree AS
        (
          SELECT id, name, parent_id
            FROM entities
            WHERE id = [ID_OF_ENTITY]
          UNION ALL
          SELECT e.id, e.name, e.parent_id
            FROM tree AS t
              JOIN entities AS e ON t.id = e.parent_id
        )
        SELECT id, name, parent_id FROM tree;
    ");
    
    // Fetch all records
    // @MYSQLI_ASSOC - Columns are returned into the array having the field name as the array index.
    $output = mysqli_fetch_all($records, MYSQLI_ASSOC);
    
    // Close the connection
    $dbConnection->close();
    
    return $output;
}

Performance Comparision

For performance comparison, I am going to use blackfire.io. We will be ignoring the performance of all other components like MySQL except for transformTree().

Results

ID_OF_ENTITY = 846539, ~5,000 DESCENDANTS

# CALLS FOR transformTree() EXECUTION TIME I/O WAIT TIME CPU PEAK MEMORY
Method 1 - Result 4,960 1.58 s 58.9 ms 1.52 s 3.88 MB
Method 2 - Result 847 88 ms 79.4 ms 8.65 ms 4.41 MB

ID_OF_ENTITY = 846542, ~10,000 DESCENDANTS

# CALLS FOR transformTree() EXECUTION TIME I/O WAIT TIME CPU PEAK MEMORY
Method 1 - Result 12,014 21.3 s 40 ms 21.2 s 8.28 MB
Method 2 - Result 1,807 185 ms 158 ms 27.4 ms 10.3 MB

ID_OF_ENTITY = 846535, ~20,000 DESCENDANTS

# CALLS FOR transformTree() EXECUTION TIME I/O WAIT TIME CPU PEAK MEMORY
Method 1 - Result 19,059 1 min 5 s 82.3 ms 1 min 5 s 13.6 MB
Method 2 - Result 3,061 179 ms 122 ms 57.6 ms 16.8 MB

ID_OF_ENTITY = 846504, ~50,000 DESCENDANTS

# CALLS FOR transformTree() EXECUTION TIME I/O WAIT TIME CPU PEAK MEMORY
Method 1 - Result 54,703 11 min 33 s 1.03 s 11 min 32 s 37.4 MB
Method 2 - Result 8,589 534 ms 404 ms 129 ms 47.1 MB

Conclusion

As you can see from the above result set, Method 2 is significantly faster than Method 1 when it comes to big data set.

I tried to run Method 1 on the complete data set with ~ 500K records, and I waited for more than an hr, but it never executed. On the flip side - Method 2 was still able to complete within 10 seconds.

The main reason for Method 2 being superior is because of the recursive method called significantly less than Method 1.

In future articles, I will talk about how you can best mock any third party service while running PHPUnit tests inside Laravel. As making a real connection slows down the complete suite.

4 min read

JSON, PHP, BLOG

PHP - Convert Adjacency set to nested JSON

Who is this article for?

Anyone looking for how to convert an adjacency set to a nested JSON response. Different types of data can be stored as adjacency set e.x. Menu items, any kind of hierarchy, etc.

In this article, I’m going to walk through on a couple of algorithms written in PHP.

Before You Begin

We will be utilizing the data definition and structure from this article - MySQL – Adjacency List Model For Hierarchical Data Using CTE. You can refer to this post to create the MySQL Table and populate data.

Method 1

<?php

// Get the tree array from DB
$treeArray = getDataFromDatabase();

// Transform the data
$outputTree = transformTree($treeArray);

// Output the response
header('Content-Type: application/json');
echo json_encode($outputTree[0]);

/**
 * Transform the tree
 *
 * @param $treeArray
 * @param null $parentId
 * @return array
 */
function transformTree($treeArray, $parentId = null)
{
    $output = [];

    // Read through all nodes of the tree
    foreach ($treeArray as $node) {

        // If the node parent is same as parent passed in argument
        if ($node['parent_id'] == $parentId) {

            // Get all the children for that node, using recursive method
            $children = transformTree($treeArray, $node['id']);

            // If children are found, add it to the node children array
            if ($children) {
                $node['children'] = $children;
            }

            // Add the main node with/without children to the main output
            $output[] = $node;

            // Remove the node from main array to avoid duplicate reading, speed up the process
            unset($node);
        }
    }
    return $output;
}

/**
 * Get all records from DB and return array
 *
 * @return array
 */
function getDataFromDatabase()
{
    // Create database connection
    $dbConnection = new mysqli("localhost", "root", "secret", "adjacency");

    // Get the result from DB Table
    $records = $dbConnection->query("SELECT id, parent_id, name FROM category ORDER BY parent_id");

    // Fetch all records
    // @MYSQLI_ASSOC - Columns are returned into the array having the field name as the array index.
    $output = mysqli_fetch_all($records, MYSQLI_ASSOC);

    // Close the connection
    $dbConnection->close();

    return $output;
}

Method 2

<?php

// Get the tree array from DB
$treeArray = getDataFromDatabase();

// Group by parent id
$treeArrayGroups = [];
foreach ($treeArray as $record) {
    $treeArrayGroups[$record['parent_id']][] = $record;
}

// Get the root
$rootArray = $treeArray[0];

// Transform the data
$outputTree = transformTree($treeArrayGroups, $rootArray);

// Output the response
header('Content-Type: application/json');
echo json_encode($outputTree);

/**
 * Transform the tree
 *
 * @param $treeArrayGroups
 * @param $rootArray
 * @return mixed
 */
function transformTree($treeArrayGroups, $rootArray)
{
    // Read through all nodes where parent is root array
    foreach ($treeArrayGroups[$rootArray['id']] as $child) {
        
    // If there is a group for that child, aka the child has children
        if (isset($treeArrayGroups[$child['id']])) {
            // Traverse into the child
            $newChild = transformTree($treeArrayGroups, $child);
        } else {
            $newChild = $child;
        }

        // Assign the child to the array of children in the root node
        $rootArray['children'][] = $newChild;
    }

    return $rootArray;
}

/**
 * Get all records from DB and return array
 *
 * @return array
 */
function getDataFromDatabase()
{
    // Create database connection
    $dbConnection = new mysqli("localhost", "root", "secret", "adjacency");

    // Get the result from DB Table
    $records = $dbConnection->query("SELECT id, parent_id, name FROM category ORDER BY parent_id");

    // Fetch all records
    // @MYSQLI_ASSOC - Columns are returned into the array having the field name as the array index.
    $output = mysqli_fetch_all($records, MYSQLI_ASSOC);

    // Close the connection
    $dbConnection->close();
    return $output;
}

Output

{
  "id": "1",
  "parent_id": null,
  "name": "Electronics",
  "children": [
    {
      "id": "2",
      "parent_id": "1",
      "name": "TV",
      "children": [
        {
          "id": "3",
          "parent_id": "2",
          "name": "Smart"
        },
        {
          "id": "4",
          "parent_id": "2",
          "name": "4K Ultra HD"
        },
        {
          "id": "5",
          "parent_id": "2",
          "name": "Curved"
        }
      ]
    },
    {
      "id": "6",
      "parent_id": "1",
      "name": "Camera"
    },
    {
      "id": "7",
      "parent_id": "1",
      "name": "Computer",
      "children": [
        {
          "id": "8",
          "parent_id": "7",
          "name": "Desktop"
        },
        {
          "id": "9",
          "parent_id": "7",
          "name": "Laptops",
          "children": [
            {
              "id": "10",
              "parent_id": "9",
              "name": "Work"
            },
            {
              "id": "11",
              "parent_id": "9",
              "name": "Travel"
            },
            {
              "id": "12",
              "parent_id": "9",
              "name": "All Around"
            },
            {
              "id": "13",
              "parent_id": "9",
              "name": "Gaming"
            }
          ]
        },
        {
          "id": "14",
          "parent_id": "7",
          "name": "Tablet"
        }
      ]
    }
  ]
}

Performance Comparision

For performance comparison, I am going to use blackfire.io. We will be ignoring the performance of all other components like MySQL except for transformTree(). To get more accurate results, I am going to use several sample size, in this case, I used 10.

Method 1

Method 2

Basic Result

# EXECUTION TIME I/O WAIT TIME CPU PEAK MEMORY
Method 1 734 µs 376 µs 358 µs 44.2 kB
Method 2 762 µs 392 µs 371 µs 44.8 kB

Recursive Calls for transformTree()

# TOTAL DB RECORDS CALLS FOR transformTree()
Method 1 14 15
Method 2 14 4

Full Result of Method 1

Full Result of Method 2

Conclusion

As you can see both Method 1 and Method 2 generates the same output. These can be used with any type of dataset. It performs recursive operations and produces output for any depth of data.

Although, we don't really see the benefit of using Method 2 over Method 1 as the data set used is very small. The difference in performance is almost similar for small dataset.

But, I would strongly recommend using Method 2, as this method is far more superior and faster.

In my next article, we will dive deep into comparing the performance of these two methods.

3 min read