From the blog
How-Tos, Deep-Dives, Brain-Dumps, and More
GITHUB, ACTIONS, PHP, BLOG
Fix the style of your code using Github Actions
In this article, we will leverage Github Actions to fix our PHP, JS, and CSS code style.
For most of my projects, I like to use styleci.io. Recently I was working on a project where I cannot use StyleCI because of compliance. Since our project was hosted on Github, we created Github workflows to replicate the functionality. Our project had two separate repositories, one for back-end API and one for our SPA frontend. So, we will be using two different Github workflows to style our codebase. But you can always combine it into one workflow depending on your requirement.
I have created a Demo Github repository that contains a working example of this concept. Let's discuss those two workflows.
Style PHP
This workflow fixes your PHP code to follow standards. We are using PHP-CS-Fixer
for the same.
We will need to do three things.
- Install the PHP-CS-Fixer composer package.
- Configure our Github workflow.
- Configure our PHP-CS-Fixer rules set
Step 1
Let's start by adding the FriendsOfPHP/PHP-CS-Fixer
composer package to our codebase.
composer require friendsofphp/php-cs-fixer --dev
Step 2
Create a Github workflow inside the .github/workflows
directory. I named my workflow as style-php-cs-fixer.yml
.
name: Style PHP-CS-Fixer
on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
jobs:
style:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
with:
ref: ${{ github.head_ref }}
- name: Install composer packages
run: composer install
- name: Run PHP-CS-Fixer Style
run: ./vendor/bin/php-cs-fixer fix
- uses: stefanzweifel/git-auto-commit-action@v4
with:
commit_message: Apply PHP-CS-Fixer changes
Step 3
Create a .php_cs
file with the Rule Sets. You can view the list of available rules here.
I used the rule set that I use with most of my PHP/Laravel projects.
<?php
use PhpCsFixer\Config;
use PhpCsFixer\Finder;
$rules = [
'array_syntax' => ['syntax' => 'short'],
'binary_operator_spaces' => [
'default' => 'single_space',
'operators' => ['=>' => null]
],
'blank_line_after_namespace' => true,
'blank_line_after_opening_tag' => true,
'blank_line_before_statement' => [
'statements' => ['return']
],
'braces' => true,
'cast_spaces' => true,
'class_attributes_separation' => [
'elements' => ['method']
],
'class_definition' => true,
'concat_space' => [
'spacing' => 'none'
],
'declare_equal_normalize' => true,
'elseif' => true,
'encoding' => true,
'full_opening_tag' => true,
'fully_qualified_strict_types' => true, // added by Shift
'function_declaration' => true,
'function_typehint_space' => true,
'heredoc_to_nowdoc' => true,
'include' => true,
'increment_style' => ['style' => 'post'],
'indentation_type' => true,
'linebreak_after_opening_tag' => true,
'line_ending' => true,
'lowercase_cast' => true,
'lowercase_constants' => true,
'lowercase_keywords' => true,
'lowercase_static_reference' => true, // added from Symfony
'magic_method_casing' => true, // added from Symfony
'magic_constant_casing' => true,
'method_argument_space' => true,
'native_function_casing' => true,
'no_alias_functions' => true,
'no_extra_blank_lines' => [
'tokens' => [
'extra',
'throw',
'use',
'use_trait',
]
],
'no_blank_lines_after_class_opening' => true,
'no_blank_lines_after_phpdoc' => true,
'no_closing_tag' => true,
'no_empty_phpdoc' => true,
'no_empty_statement' => true,
'no_leading_import_slash' => true,
'no_leading_namespace_whitespace' => true,
'no_mixed_echo_print' => [
'use' => 'echo'
],
'no_multiline_whitespace_around_double_arrow' => true,
'multiline_whitespace_before_semicolons' => [
'strategy' => 'no_multi_line'
],
'no_short_bool_cast' => true,
'no_singleline_whitespace_before_semicolons' => true,
'no_spaces_after_function_name' => true,
'no_spaces_around_offset' => true,
'no_spaces_inside_parenthesis' => true,
'no_trailing_comma_in_list_call' => true,
'no_trailing_comma_in_singleline_array' => true,
'no_trailing_whitespace' => true,
'no_trailing_whitespace_in_comment' => true,
'no_unneeded_control_parentheses' => true,
'no_unreachable_default_argument_value' => true,
'no_useless_return' => true,
'no_whitespace_before_comma_in_array' => true,
'no_whitespace_in_blank_line' => true,
'normalize_index_brace' => true,
'not_operator_with_successor_space' => true,
'object_operator_without_whitespace' => true,
'ordered_imports' => ['sortAlgorithm' => 'alpha'],
'phpdoc_indent' => true,
'phpdoc_inline_tag' => true,
'phpdoc_no_access' => true,
'phpdoc_no_package' => true,
'phpdoc_no_useless_inheritdoc' => true,
'phpdoc_scalar' => true,
'phpdoc_single_line_var_spacing' => true,
'phpdoc_summary' => true,
'phpdoc_to_comment' => true,
'phpdoc_trim' => true,
'phpdoc_types' => true,
'phpdoc_var_without_name' => true,
'psr4' => true,
'self_accessor' => true,
'short_scalar_cast' => true,
'simplified_null_return' => false, // disabled by Shift
'single_blank_line_at_eof' => true,
'single_blank_line_before_namespace' => true,
'single_class_element_per_statement' => true,
'single_import_per_statement' => true,
'single_line_after_imports' => true,
'single_line_comment_style' => [
'comment_types' => ['hash']
],
'single_quote' => true,
'space_after_semicolon' => true,
'standardize_not_equals' => true,
'switch_case_semicolon_to_colon' => true,
'switch_case_space' => true,
'ternary_operator_spaces' => true,
'trailing_comma_in_multiline_array' => true,
'trim_array_spaces' => true,
'unary_operator_spaces' => true,
'visibility_required' => [
'elements' => ['method', 'property']
],
'whitespace_after_comma_in_array' => true,
];
$project_path = getcwd();
$finder = Finder::create()
->in([
$project_path
])
->name('*.php')
->notName('*.blade.php')
->ignoreDotFiles(true)
->ignoreVCS(true);
return Config::create()
->setFinder($finder)
->setRules($rules)
->setRiskyAllowed(true)
->setUsingCache(true);
You will need to modify the path on line number 129 to match your codebase. Here is an example rule sets that laravel shit uses.
Now our PHP Style is all set.
Style JS
This workflow fixes your JS, CSS, and Vue code to follow standards. We are using Prettier
for the same.
We will need to do three things. Install the Prettier package. Configure our Github workflow. Configure our Prettier configuration file
Step 1
Let's start by adding the Prettier
package to our codebase.
yarn add --dev --exact prettier
Step 2
Create a Github workflow inside the .github/workflows
directory. I named my workflow as style-prettier.yml
.
name: Style Prettier
on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
jobs:
style:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
with:
ref: ${{ github.head_ref }}
- name: Install node packages
run: yarn
- name: Run fixer
run: yarn prettier --write *.{css,js,vue}
- uses: stefanzweifel/git-auto-commit-action@v4
with:
commit_message: Apply prettier changes
Step 3
Create a .prettierrc
configuration file.
{
"printWidth": 100,
"singleQuote": true,
"tabWidth": 4,
"trailingComma": "es5"
}
Result
To verify that our workflows perform the desired actions. I have created a pull request will minor changes to the PHP, JS, CSS, and VUE files.
As you can see, our workflow corrected our code style and added two new commits on our pull request.
You can review what files were changed in each commit.
Links
DATABASE, PHP, LARAVEL, BLOG
Laravel - Creating a date dimension (calendar) table
What is a date dimension table
Date Dimension is a table that has one record per each day, no more, no less! Date dimension plays a vital role in your data warehouse designing, it provides the ability to study behavior and trend of your data over a period of time.
Most BI tools have an in-built date dimension table, but sometimes you might want to create one. Date dimension table is only created and loaded once since the data is not changed that often.
This article is inspired by the article "Creating a date dimension or calendar table in SQL Server". The article is specific for MS SQL database, but in our example, we'll create one using Laravel and Carbon so it can work with any kind of database driver.
We will be creating the following files with Laravel.
- Migration - To create the table.
- Model - To reference the table.
- Console Command - To populate the table.
Step 1 - Create the migration
The columns in this table are self-explanatory, you can also google them if you want to learn more.
<?php
use Illuminate\Support\Facades\Schema;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Database\Migrations\Migration;
class CreateDateDimensionsTable extends Migration
{
/**
* Run the migrations.
*
* @return void
*/
public function up()
{
Schema::create('date_dimensions', function (Blueprint $table) {
$table->date('date')->primary();
$table->unsignedInteger('day');
$table->unsignedInteger('month');
$table->unsignedInteger('year');
$table->string('day_name');
$table->string('day_suffix', 2);
$table->unsignedInteger('day_of_week');
$table->unsignedInteger('day_of_year');
$table->unsignedInteger('is_weekend');
$table->unsignedInteger('week');
$table->unsignedInteger('iso_week');
$table->unsignedInteger('week_of_month');
$table->unsignedInteger('week_of_year');
$table->unsignedInteger('iso_week_in_year');
$table->string('month_name');
$table->string('month_year');
$table->string('month_name_year');
$table->date('first_day_of_month');
$table->date('last_day_of_month');
$table->date('first_day_of_next_month');
$table->unsignedInteger('quarter');
$table->string('quarter_name');
$table->date('first_day_of_quarter');
$table->date('last_day_of_quarter');
$table->date('first_day_of_year');
$table->date('last_day_of_year');
$table->date('first_day_of_next_year');
$table->unsignedInteger('dow_in_month');
$table->timestamps();
});
}
/**
* Reverse the migrations.
*
* @return void
*/
public function down()
{
Schema::dropIfExists('date_dimensions');
}
}
Step 2 - Create the model
In the model we are also typecasting the attributes, so it sets/gets the value in the correct format.
<?php
namespace App\Models;
use Illuminate\Database\Eloquent\Model;
use Illuminate\Database\Eloquent\SoftDeletes;
class DateDimension extends Model
{
/**
* Indicates if the IDs are auto-incrementing.
*
* @var bool
*/
public $incrementing = false;
/**
* The "type" of the primary key ID.
*
* @var string
*/
protected $keyType = 'date';
/**
* The primary key associated with the table.
*
* @var string
*/
protected $primaryKey = 'date';
/**
* The attributes that are mass assignable.
*
* @var array
*/
protected $fillable = [
'date',
'day',
'month',
'year',
'day_name',
'day_suffix',
'day_of_week',
'day_of_year',
'is_weekend',
'week',
'iso_week',
'week_of_month',
'week_of_year',
'iso_week_in_year',
'month_name',
'month_year',
'month_name_year',
'first_day_of_month',
'last_day_of_month',
'first_day_of_next_month',
'quarter',
'quarter_name',
'first_day_of_quarter',
'last_day_of_quarter',
'first_day_of_year',
'last_day_of_year',
'first_day_of_next_year',
'dow_in_month',
];
/**
* The attributes that should be cast to native types.
*
* @var array
*/
protected $casts = [
'date' => 'date:Y-m-d',
'first_day_of_month' => 'date:Y-m-d',
'last_day_of_month' => 'date:Y-m-d',
'first_day_of_next_month' => 'date:Y-m-d',
'first_day_of_quarter' => 'date:Y-m-d',
'last_day_of_quarter' => 'date:Y-m-d',
'first_day_of_year' => 'date:Y-m-d',
'last_day_of_year' => 'date:Y-m-d',
'first_day_of_next_year' => 'date:Y-m-d',
];
}
Step 3 - Create a command
<?php
namespace App\Console\Commands;
use App\Models\DateDimension;
use Carbon\Carbon;
use Carbon\CarbonPeriod;
use Illuminate\Console\Command;
use function ceil;
use function now;
class PopulateDateDimensionsTableCommand extends Command
{
/**
* The name and signature of the console command.
*
* @var string
*/
protected $signature = 'app:PopulateDateDimensionsTableCommand';
/**
* The console command description.
*
* @var string
*/
protected $description = 'Populate data for date dimensions table';
/**
* Create a new command instance.
*
* @return void
*/
public function __construct()
{
parent::__construct();
}
/**
* Execute the console command.
*
* @return mixed
*/
public function handle()
{
$this->info(now()->toDateTimeString() . " Start: app:PopulateDateDimensionsTableCommand");
// Truncate all records
DateDimension::truncate();
// Create an empty array and save the transformed input to array
$dataToInsert = [];
// Get the date range
// @NOTE - update the start and end date as per your choice
$dates = CarbonPeriod::create('2015-01-01', '2030-12-31');
// For each dates create a transformed data
foreach ($dates as $date) {
// Get the quarter details, as ABC has a different quarter system
// @note - Carbon does not allow to override the quarters
$quarterDetails = $this->getQuarterDetails($date);
// Main transformer
$dataToInsert[] = [
'date' => $date->format('Y-m-d'),
'day' => $date->day,
'month' => $date->month,
'year' => $date->year,
'day_name' => $date->dayName,
'day_suffix' => $this->getDaySuffix($date->day),
'day_of_week' => $date->dayOfWeek,
'day_of_year' => $date->dayOfYear,
'is_weekend' => (int) $date->isWeekend(),
'week' => $date->week,
'iso_week' => $date->isoWeek,
'week_of_month' => $date->weekOfMonth,
'week_of_year' => $date->weekOfYear,
'iso_week_in_year' => $date->isoWeeksInYear,
'month_name' => $date->monthName,
'month_year' => $date->format('mY'),
'month_name_year' => $date->format('MY'),
'first_day_of_month' => $date->clone()->firstOfMonth()->format('Y-m-d'),
'last_day_of_month' => $date->clone()->lastOfMonth()->format('Y-m-d'),
'first_day_of_next_month' => $date->clone()->addMonthNoOverflow()->firstOfMonth()->format('Y-m-d'),
'quarter' => $quarterDetails['value'],
'quarter_name' => $quarterDetails['name'],
'first_day_of_quarter' => $quarterDetails['first_day_of_quarter'],
'last_day_of_quarter' => $quarterDetails['last_day_of_quarter'],
'first_day_of_year' => $date->clone()->firstOfYear()->format('Y-m-d'),
'last_day_of_year' => $date->clone()->lastOfYear()->format('Y-m-d'),
'first_day_of_next_year' => $date->clone()->addYear()->firstOfYear()->format('Y-m-d'),
'dow_in_month' => (int)ceil($date->day/7),
'created_at' => now(),
'updated_at' => now(),
];
}
// Create chunks for faster insertion
// @note - SQL Server supports a maximum of 2100 parameters.
$chunks = collect($dataToInsert)->chunk(50);
// Using chunks insert the data
foreach ($chunks as $chunk) {
DateDimension::insert($chunk->toArray());
}
$this->info(now()->toDateTimeString() . " Complete: app:PopulateDateDimensionsTableCommand");
}
/**
* Get Quarter details
* @OTE - Depending on your companies quarter update the map and logic below
*
* @param Carbon $date
* @return array
*/
private function getQuarterDetails(Carbon $date)
{
$quarterMonthMap = [
1 => ['value' => 1, 'name' => 'First'],
2 => ['value' => 2, 'name' => 'Second'],
3 => ['value' => 2, 'name' => 'Second'],
4 => ['value' => 2, 'name' => 'Second'],
5 => ['value' => 3, 'name' => 'Third'],
6 => ['value' => 3, 'name' => 'Third'],
7 => ['value' => 3, 'name' => 'Third'],
8 => ['value' => 4, 'name' => 'Fourth'],
9 => ['value' => 4, 'name' => 'Fourth'],
10 => ['value' => 4, 'name' => 'Fourth'],
11 => ['value' => 1, 'name' => 'First'],
12 => ['value' => 1, 'name' => 'First'],
];
$output['value'] = $quarterMonthMap[$date->month]['value'];
$output['name'] = $quarterMonthMap[$date->month]['name'];
switch ($output['value']) {
case 1:
$output['first_day_of_quarter'] = Carbon::parse($date->year - 1 . '-11-01')->firstOfMonth()->format('Y-m-d');
$output['last_day_of_quarter'] = Carbon::parse($date->year . '-01-01')->lastOfMonth()->format('Y-m-d');
break;
case 2:
$output['first_day_of_quarter'] = Carbon::parse($date->year . '-02-01')->firstOfMonth()->format('Y-m-d');
$output['last_day_of_quarter'] = Carbon::parse($date->year . '-04-01')->lastOfMonth()->format('Y-m-d');
break;
case 3:
$output['first_day_of_quarter'] = Carbon::parse($date->year . '-05-01')->firstOfMonth()->format('Y-m-d');
$output['last_day_of_quarter'] = Carbon::parse($date->year . '-07-01')->lastOfMonth()->format('Y-m-d');
break;
case 4:
$output['first_day_of_quarter'] = Carbon::parse($date->year . '-08-01')->firstOfMonth()->format('Y-m-d');
$output['last_day_of_quarter'] = Carbon::parse($date->year . '-10-01')->lastOfMonth()->format('Y-m-d');
break;
}
return $output;
}
/**
* Get the Day Suffix
* Copied logic from - https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server/
*
* @param $day
* @return string
*/
private function getDaySuffix($day)
{
if ($day/10 == 1) {
return "th";
}
$right = substr($day, -1);
if ($right == 1) {
return 'st';
}
if ($right == 2) {
return 'nd';
}
if ($right == 3) {
return 'rd';
}
return 'th';
}
}
In this command, update the start date and end date (line 55) as per your requirement. Also, the quarter in this script run from November-January, February- April, May-July, August-October, this might be different for your use case, so it should be updated accordingly.
Conclusion
In this article you were able to design a date dimension table using Laravel. The migration, commands, etc. are database driver independent. So you can essentially create this table in MS SQL or MySQL or any other database supported by Laravel.
References
TESTING, BLOG
How to add PSR-2 code check as unit test
Who is this article for?
This article is for developers who are looking to add PSR-2 code check to their project.
Before You Begin
I am going to assume you know what PSR-2 coding style entails. If you are not fully aware of what PSR-2 is, please take time and read about it.
Below are some good reads -
- https://www.php-fig.org/psr/psr-2/
- https://docs.opnsense.org/development/guidelines/psr2.html
- https://devticks.com/why-standards-are-so-important-in-development-b3ef973b02ad
Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
Refactoring: Improving the Design of Existing Code, 1999. Martin Fowler.
Now that you have a basic understanding of what PSR-2 is, let's jump into implementing it. We are going to implement it as a Unit test called CodeStyleLinterTest
.
Step 1 - Install Required Packages
PHP-CS-Fixer
This tool will verify your coding standards as defined in the PSR-1, PSR-2, etc., or other community-driven ones like the Symfony one. You can also define your (team's) style through configuration. You can read all about the package here - https://github.com/FriendsOfPHP/PHP-CS-Fixer. It even allows you to create custom rule-based on your requirements. Ok, so let's install it now to our PHP project
composer require friendsofphp/php-cs-fixer --dev
The Process Component
The Process component executes commands in sub-processes. You can also use exec()
but for security reason, I would recommend using the process component package.
composer require symfony/process --dev
As I am going to use these two packages only for my development purposes only, I have added them as dev dependencies of my composer.json
.
Step 2 - The Unit Test Case
Script
<?php
namespace Tests\Unit;
use Symfony\Component\Process\Process;
use function base_path;
use Tests\TestCase;
class CodeStyleLinterTest extends TestCase
{
// Add more path to test
const PATH_TO_TEST = [
'app',
'config',
'public/index.php',
'tests',
'database'
];
/** @test */
public function psr2()
{
$phpCsFixerPath = base_path('vendor/bin/php-cs-fixer');
// Let's check PSR-2 compliance for our code
foreach (self::PATH_TO_TEST as $path) {
$fullPath = base_path($path);
// Run linter in dry-run mode so it changes nothing.
$process = new Process([$phpCsFixerPath, 'fix', $fullPath, '--dry-run', '--allow-risky=no']);
$process->run();
// Exit code should be 0, else there is a problem with the PSR-2 compliance
$this->assertEquals(
0,
$process->getExitCode(),
$process->getOutput()
);
}
}
}
I created this unit test for my laravel app, but you can use it for any PHP project. You can update the PATH_TO_TEST
constant to include your directories or files you want to run the PSR-2 code check.
Conclusion
Having a simple PSR-2 does not ensure code quality; it only makes the code look pretty i.e., formatting. You should conduct regular code review session to go over your code.
Sources
- https://www.php-fig.org/psr/psr-2/
- https://docs.opnsense.org/development/guidelines/psr2.html
- https://devticks.com/why-standards-are-so-important-in-development-b3ef973b02ad
- https://github.com/FriendsOfPHP/PHP-CS-Fixer
- https://symfony.com/doc/current/components/process.html
- https://miro.medium.com/max/698/1*zXeTjYDcMSz-0w-os6qFUg.png
MYSQL, ACCESS, DATABASE, BLOG
How to effectively import large MS Access DB to MySQL DB
Who is this article for?
Anyone looking for how to import large MS Access DB to MySQL DB effectively.
Before You Begin
There are several options which we can use to import the data to MySQL. Few of those are listed in these articles below -
https://stackoverflow.com/questions/4809654/how-to-import-an-access-mdb-format-database-to-mysql
https://stackoverflow.com/questions/5465661/converting-mysql-to-ms-access
These options work well if the data that needs to be imported is small but is very slow for an extensive database. The reason the import is very slow is that it creates one insert statement per record. I recently had to convert a relatively large MS Access DB file to MySQL. The import process was prolonged, and I let it run overnight, but it didn't complete. So I had to come up with a different solution which can import the data much faster.
For this post, we can use the data provided by IPEDS. The file can be downloaded from here - https://nces.ed.gov/ipeds/use-the-data/download-access-database. We will use the 2017-18 Access file, which is the latest release at the time of writing this article.
Also, we are going to use mdbtools. I believe you can install it on Windows Operating system too - https://github.com/brianb/mdbtools/issues/107. Since I run a Linux distribution installing it was an easy setup.
Script
#!/bin/bash
#===========================================================
# FILE : script.sh
# DESCRIPTION : Convert MS Access to MySQL.
#===========================================================
set -o nounset # help avoid bugs
shopt -s extglob
PATH=/bin:/usr/bin:/sbin # for consistency
# variables declaration
DB_NAME=ENTER_YOUR_DB_NAME
DB_UN=ENTER_YOUR_DB_USER
PATH_TO_ACCESS_FILE=ENTER_THE_ABSOLUTE_PATH_TO_ACCESS_FILE
# set login path to suppress warnings
mysql_config_editor remove --login-path=local
mysql_config_editor set --login-path=local --host=localhost --user=$DB_UN --password
# get all the tables
TABLES=$(mdb-tables -1 $PATH_TO_ACCESS_FILE)
# drop tables if exits
echo "$(date +%Y%m%d_%H%M) DROPPING TABLE IF EXISTS"
for t in $TABLES
do
mysql --login-path=local $DB_NAME -Bse "DROP TABLE IF EXISTS $t;"
done
# create meta definition for the tables
rm -rf meta.sql
mdb-schema $PATH_TO_ACCESS_FILE mysql > meta.sql
# create the tables using the meta.sql file generated above
echo "$(date +%Y%m%d_%H%M) CREATING TABLES;"
mysql --login-path=local $DB_NAME < meta.sql
rm -rf meta.sql
echo "$(date +%Y%m%d_%H%M) IMPORTING DATA INTO TABLES;"
for t in $TABLES
do
echo "$(date +%Y%m%d_%H%M) IMPORTING DATA FOR $t;"
rm -rf $t.csv
mdb-export -D '%Y-%m-%d %H:%M:%S' $PATH_TO_ACCESS_FILE $t > $t.csv
mysqlimport --login-path=local --ignore-lines=1 --fields-terminated-by=, --fields-optionally-enclosed-by='"' --local $DB_NAME $t.csv
mysql --login-path=local $DB_NAME -Bse "ALTER TABLE $t CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;"
rm -rf $t.csv
done
echo "$(date +%Y%m%d_%H%M) COMPLETED;"
Conclusion
When I used this script, I was able to import the data in less than a few minutes. A general outline of how the script works.
- Creates
login path
to suppress warnings. - Drops tables if they exist from the specified database.
- Recreate the table metadata from MS Access.
- Export each table to
.csv
file. - Bulk import the
.csv
data to the specified table.
Limitations
- The above script is tested on a Linux distribution.
- mdbtools does not create indexes or foreign key references.
Sources
- https://stackoverflow.com/questions/5722544/how-can-i-convert-an-mdb-access-file-to-mysql-or-plain-sql-file
- https://stackoverflow.com/questions/4809654/how-to-import-an-access-mdb-format-database-to-mysql
- https://stackoverflow.com/questions/5465661/converting-mysql-to-ms-access
- https://nces.ed.gov/ipeds/use-the-data/download-access-database
- https://github.com/brianb/mdbtools
- https://github.com/lsgunth/mdbtools-win
JSON, PERFORMANCE, PHP, BLOG
PHP - Performance - Adjacency set to nested JSON
Before You Begin
In this article, we will compare the performance of Method 1 and Method 2 as used in - PHP – Convert Adjacency set to nested JSON.
We will be utilizing an extensive data set with approximately 500K records. I have downloaded these records from itis.gov and made some minor modification for this article. Data can be downloaded from here - https://drive.google.com/open?id=115_d5PV4OQ948cWP5TLXUpN2TvLvvo_y
We will be using MySQL CTE and break the data into chunks to test the performance for different size of records. You can refer to the CTE article here - MySQL – Adjacency List Model For Hierarchical Data Using CTE.
There is a slight change made to getDataFromDatabase()
method as we want to limit the records to test the performance. Below you can find the updated definition. I have added the placeholder inside the CTE as [ID_OF_ENTITY]
. This will be our starting point and then I will retrieve all descendants for this entity and convert it to a nested response.
Method 1
<?php
// Initialize php setting - @node - NOT RECOMMENDED FOR PRODUCTION, update your php.ini if needed.
error_reporting(E_ALL);
ini_set('display_errors', 1);
ini_set('memory_limit', '-1');
ini_set('max_execution_time', 30000);
// Get the tree array from DB
$treeArray = getDataFromDatabase();
// Transform the data
$outputTree = transformTree($treeArray, [ID_OF_ENTITY]);
// Output the response
header('Content-Type: application/json');
echo json_encode($outputTree[0]);
/**
* Transform the tree
*
* @param $treeArray
* @param null $parentId
* @return array
*/
function transformTree($treeArray, $parentId = null)
{
$output = [];
// Read through all nodes of the tree
foreach ($treeArray as $node) {
// If the node parent is same as parent passed in argument
if ($node['parent_id'] == $parentId) {
// Get all the children for that node, using recursive method
$children = transformTree($treeArray, $node['id']);
// If children are found, add it to the node children array
if ($children) {
$node['children'] = $children;
}
// Add the main node with/without children to the main output
$output[] = $node;
// Remove the node from main array to avoid duplicate reading, speed up the process
unset($node);
}
}
return $output;
}
/**
* Get all records from DB and return array
*
* @return array
*/
function getDataFromDatabase()
{
// Create database connection
$dbConnection = new mysqli("localhost", "root", "secret", "adjacency");
// Get the result from DB Table
$records = $dbConnection->query("
WITH RECURSIVE tree AS
(
SELECT id, name, parent_id
FROM entities
WHERE id = [ID_OF_ENTITY]
UNION ALL
SELECT e.id, e.name, e.parent_id
FROM tree AS t
JOIN entities AS e ON t.id = e.parent_id
)
SELECT id, name, parent_id FROM tree;
");
// Fetch all records
// @MYSQLI_ASSOC - Columns are returned into the array having the field name as the array index.
$output = mysqli_fetch_all($records, MYSQLI_ASSOC);
// Close the connection
$dbConnection->close();
return $output;
}
Method 2
<?php
// Initialize php setting - @node - NOT RECOMMENDED FOR PRODUCTION, update your php.ini if needed.
error_reporting(E_ALL);
ini_set('display_errors', 1);
ini_set('memory_limit', '-1');
ini_set('max_execution_time', 30000);
// Get the tree array from DB
$treeArray = getDataFromDatabase();
// Group by parent id
$treeArrayGroups = [];
foreach ($treeArray as $record) {
$treeArrayGroups[$record['parent_id']][] = $record;
}
// Get the root
$rootArray = $treeArray[0];
// Transform the data
$outputTree = transformTree($treeArrayGroups, $rootArray);
// Output the response
header('Content-Type: application/json');
echo json_encode($outputTree);
/**
* Transform the tree
*
* @param $treeArrayGroups
* @param $rootArray
* @return mixed
*/
function transformTree($treeArrayGroups, $rootArray)
{
// Read through all nodes where parent is root array
foreach ($treeArrayGroups[$rootArray['id']] as $child) {
// If there is a group for that child, aka the child has children
if (isset($treeArrayGroups[$child['id']])) {
// Traverse into the child
$newChild = transformTree($treeArrayGroups, $child);
} else {
$newChild = $child;
}
// Assign the child to the array of children in the root node
$rootArray['children'][] = $newChild;
}
return $rootArray;
}
/**
* Get all records from DB and return array
*
* @return array
*/
function getDataFromDatabase()
{
// Create database connection
$dbConnection = new mysqli("localhost", "root", "secret", "adjacency");
// Get the result from DB Table
$records = $dbConnection->query("
WITH RECURSIVE tree AS
(
SELECT id, name, parent_id
FROM entities
WHERE id = [ID_OF_ENTITY]
UNION ALL
SELECT e.id, e.name, e.parent_id
FROM tree AS t
JOIN entities AS e ON t.id = e.parent_id
)
SELECT id, name, parent_id FROM tree;
");
// Fetch all records
// @MYSQLI_ASSOC - Columns are returned into the array having the field name as the array index.
$output = mysqli_fetch_all($records, MYSQLI_ASSOC);
// Close the connection
$dbConnection->close();
return $output;
}
Performance Comparision
For performance comparison, I am going to use blackfire.io. We will be ignoring the performance of all other components like MySQL except for transformTree().
Results
ID_OF_ENTITY = 846539, ~5,000 DESCENDANTS
# | CALLS FOR transformTree() |
EXECUTION TIME | I/O WAIT TIME | CPU | PEAK MEMORY |
---|---|---|---|---|---|
Method 1 - Result | 4,960 | 1.58 s | 58.9 ms | 1.52 s | 3.88 MB |
Method 2 - Result | 847 | 88 ms | 79.4 ms | 8.65 ms | 4.41 MB |
ID_OF_ENTITY = 846542, ~10,000 DESCENDANTS
# | CALLS FOR transformTree() |
EXECUTION TIME | I/O WAIT TIME | CPU | PEAK MEMORY |
---|---|---|---|---|---|
Method 1 - Result | 12,014 | 21.3 s | 40 ms | 21.2 s | 8.28 MB |
Method 2 - Result | 1,807 | 185 ms | 158 ms | 27.4 ms | 10.3 MB |
ID_OF_ENTITY = 846535, ~20,000 DESCENDANTS
# | CALLS FOR transformTree() |
EXECUTION TIME | I/O WAIT TIME | CPU | PEAK MEMORY |
---|---|---|---|---|---|
Method 1 - Result | 19,059 | 1 min 5 s | 82.3 ms | 1 min 5 s | 13.6 MB |
Method 2 - Result | 3,061 | 179 ms | 122 ms | 57.6 ms | 16.8 MB |
ID_OF_ENTITY = 846504, ~50,000 DESCENDANTS
# | CALLS FOR transformTree() |
EXECUTION TIME | I/O WAIT TIME | CPU | PEAK MEMORY |
---|---|---|---|---|---|
Method 1 - Result | 54,703 | 11 min 33 s | 1.03 s | 11 min 32 s | 37.4 MB |
Method 2 - Result | 8,589 | 534 ms | 404 ms | 129 ms | 47.1 MB |
Conclusion
As you can see from the above result set, Method 2 is significantly faster than Method 1 when it comes to big data set.
I tried to run Method 1 on the complete data set with ~ 500K records, and I waited for more than an hr, but it never executed. On the flip side - Method 2 was still able to complete within 10 seconds.
The main reason for Method 2 being superior is because of the recursive method called significantly less than Method 1.
In future articles, I will talk about how you can best mock any third party service while running PHPUnit tests inside Laravel. As making a real connection slows down the complete suite.
JSON, PHP, BLOG
PHP - Convert Adjacency set to nested JSON
Who is this article for?
Anyone looking for how to convert an adjacency set to a nested JSON response. Different types of data can be stored as adjacency set e.x. Menu items, any kind of hierarchy, etc.
In this article, I’m going to walk through on a couple of algorithms written in PHP.
Before You Begin
We will be utilizing the data definition and structure from this article - MySQL – Adjacency List Model For Hierarchical Data Using CTE. You can refer to this post to create the MySQL Table and populate data.
Method 1
<?php
// Get the tree array from DB
$treeArray = getDataFromDatabase();
// Transform the data
$outputTree = transformTree($treeArray);
// Output the response
header('Content-Type: application/json');
echo json_encode($outputTree[0]);
/**
* Transform the tree
*
* @param $treeArray
* @param null $parentId
* @return array
*/
function transformTree($treeArray, $parentId = null)
{
$output = [];
// Read through all nodes of the tree
foreach ($treeArray as $node) {
// If the node parent is same as parent passed in argument
if ($node['parent_id'] == $parentId) {
// Get all the children for that node, using recursive method
$children = transformTree($treeArray, $node['id']);
// If children are found, add it to the node children array
if ($children) {
$node['children'] = $children;
}
// Add the main node with/without children to the main output
$output[] = $node;
// Remove the node from main array to avoid duplicate reading, speed up the process
unset($node);
}
}
return $output;
}
/**
* Get all records from DB and return array
*
* @return array
*/
function getDataFromDatabase()
{
// Create database connection
$dbConnection = new mysqli("localhost", "root", "secret", "adjacency");
// Get the result from DB Table
$records = $dbConnection->query("SELECT id, parent_id, name FROM category ORDER BY parent_id");
// Fetch all records
// @MYSQLI_ASSOC - Columns are returned into the array having the field name as the array index.
$output = mysqli_fetch_all($records, MYSQLI_ASSOC);
// Close the connection
$dbConnection->close();
return $output;
}
Method 2
<?php
// Get the tree array from DB
$treeArray = getDataFromDatabase();
// Group by parent id
$treeArrayGroups = [];
foreach ($treeArray as $record) {
$treeArrayGroups[$record['parent_id']][] = $record;
}
// Get the root
$rootArray = $treeArray[0];
// Transform the data
$outputTree = transformTree($treeArrayGroups, $rootArray);
// Output the response
header('Content-Type: application/json');
echo json_encode($outputTree);
/**
* Transform the tree
*
* @param $treeArrayGroups
* @param $rootArray
* @return mixed
*/
function transformTree($treeArrayGroups, $rootArray)
{
// Read through all nodes where parent is root array
foreach ($treeArrayGroups[$rootArray['id']] as $child) {
// If there is a group for that child, aka the child has children
if (isset($treeArrayGroups[$child['id']])) {
// Traverse into the child
$newChild = transformTree($treeArrayGroups, $child);
} else {
$newChild = $child;
}
// Assign the child to the array of children in the root node
$rootArray['children'][] = $newChild;
}
return $rootArray;
}
/**
* Get all records from DB and return array
*
* @return array
*/
function getDataFromDatabase()
{
// Create database connection
$dbConnection = new mysqli("localhost", "root", "secret", "adjacency");
// Get the result from DB Table
$records = $dbConnection->query("SELECT id, parent_id, name FROM category ORDER BY parent_id");
// Fetch all records
// @MYSQLI_ASSOC - Columns are returned into the array having the field name as the array index.
$output = mysqli_fetch_all($records, MYSQLI_ASSOC);
// Close the connection
$dbConnection->close();
return $output;
}
Output
{
"id": "1",
"parent_id": null,
"name": "Electronics",
"children": [
{
"id": "2",
"parent_id": "1",
"name": "TV",
"children": [
{
"id": "3",
"parent_id": "2",
"name": "Smart"
},
{
"id": "4",
"parent_id": "2",
"name": "4K Ultra HD"
},
{
"id": "5",
"parent_id": "2",
"name": "Curved"
}
]
},
{
"id": "6",
"parent_id": "1",
"name": "Camera"
},
{
"id": "7",
"parent_id": "1",
"name": "Computer",
"children": [
{
"id": "8",
"parent_id": "7",
"name": "Desktop"
},
{
"id": "9",
"parent_id": "7",
"name": "Laptops",
"children": [
{
"id": "10",
"parent_id": "9",
"name": "Work"
},
{
"id": "11",
"parent_id": "9",
"name": "Travel"
},
{
"id": "12",
"parent_id": "9",
"name": "All Around"
},
{
"id": "13",
"parent_id": "9",
"name": "Gaming"
}
]
},
{
"id": "14",
"parent_id": "7",
"name": "Tablet"
}
]
}
]
}
Performance Comparision
For performance comparison, I am going to use blackfire.io. We will be ignoring the performance of all other components like MySQL except for transformTree()
. To get more accurate results, I am going to use several sample size, in this case, I used 10.
Method 1
Method 2
Basic Result
# | EXECUTION TIME | I/O WAIT TIME | CPU | PEAK MEMORY |
---|---|---|---|---|
Method 1 | 734 µs | 376 µs | 358 µs | 44.2 kB |
Method 2 | 762 µs | 392 µs | 371 µs | 44.8 kB |
transformTree()
Recursive Calls for # | TOTAL DB RECORDS | CALLS FOR transformTree() |
---|---|---|
Method 1 | 14 | 15 |
Method 2 | 14 | 4 |
Conclusion
As you can see both Method 1 and Method 2 generates the same output. These can be used with any type of dataset. It performs recursive operations and produces output for any depth of data.
Although, we don't really see the benefit of using Method 2 over Method 1 as the data set used is very small. The difference in performance is almost similar for small dataset.
But, I would strongly recommend using Method 2, as this method is far more superior and faster.
In my next article, we will dive deep into comparing the performance of these two methods.