What it is
The Alvarism Database is of interest to government officials, politicians, policy makers, think tanks, and university analysts and researchers who need authoritative historical information on criminal justice, national security, the medical industry, business, economics, culture, and society. It is an enterprise relational database built on SQL Server 2012, using the features of that release and earlier ones. It is designed with my expertise in enterprise and data architecture, business intelligence, reporting, and software programming.
Why it was done
Microsoft Excel is fairly primitive but very efficient and convenient for research. It handled all of my needs through nearly 700 pages of manuscript for two separate books on culture and economics. In late summer of 2015, as my final analyses came to a close, I began the most ambitious analysis of my research to date – American Security (and its relation to our economy), which includes National Security and Criminal Justice topics. I even got away with using some VBA code to do some tricky operations on the data, reminding me of how much I loathe the brute characteristics of the VB programming language.
One night it all crashed. Excel started locking up. It corrupted the files. I barely salvaged the data, and even after that, the program continuously locked up. A simple copy-paste of one cell would leave Excel spinning for an hour. My American Security files had too much data, too many formulas, and too much of…whatever Microsoft deemed outside the use cases of their program (wisely, as my voracious demands are unusual). I can attest that this had nothing to do with my machine, which is powerful enough to run three servers, a software development environment, office applications, and even a game, all at the same time.
The decision to move all of the research into a formal data design architecture and an enterprise database was costly. I knew it would take a few extra weeks of labor, but it had to be done to finish the research. I had been planning such a powerful solution for a long time, but was holding off until funding was acquired. Alvarism research is unique and valuable, and should be delivered in a robust, expandable, and easily updated system that can be used by multiple contributors and thousands of consumers.
As an engineer, business owner, and entrepreneur, I am apt to pursue maximum pragmatism. It’s a part of my nature by now. It would’ve been foolish to start with a Ferrari when a Prius was enough to get me where I was going. Sometimes we can’t predict extraordinary circumstances. But as soon as the roadblock smashed my Excel Prius down to hubcaps and chassis (Flintstone style), I was knee-deep in the next solution. I can’t imagine the nightmare this would have been if I didn’t have database administration (DBA), data design, and enterprise database programming skills. If I were a liberal arts professor, I’d have needed to solicit professors from the computer science and engineering departments just to move forward. While the setback was frustrating, I was very grateful that the delay was only temporary.
Thus the Alvarism Database was born in fire, like so many great, annealed technology gambits.
Geocoding of location point data for key areas – centroids of national boundaries, statistical and administrative divisions for most of the world, regions, and expandability for any geographic analysis need. Easy export to GIS software like Quantum GIS for maps, visualization, and animation. Postal code regular expressions, capitals, and ISO codes for countries and administrative divisions.
Code to handle the storage and transformation of BC/AD dates. At best, built-in data types can handle years 1 AD through 9999 AD. Using the Date data type, a BC Offset, and some excellent code, BC dates can be transformed to work with AD-era dates in any analysis. The storage mechanism offsets the BC dates, and the transformations keep the time distances between date ranges intact, while the readable display dates are tracked in parallel. The correct BC/AD date is displayed, and a serial offset date counter is assigned to maintain the correct relation of all dates returned from a query. The absence of a year zero between 1 BC and 1 AD is also handled by the code.
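The offset approach described above can be sketched roughly as follows. This is a minimal illustration in Python, not the database's actual SQL Server code. The names `BC_OFFSET_YEARS`, `to_stored`, `to_display`, and `serial` are hypothetical, and the offset of 4800 years is an assumed value, chosen as a multiple of 400 so the leap-year pattern survives the shift. The sketch also assumes every date is treated as a (proleptic) Gregorian date.

```python
from datetime import date

# Hypothetical offset, not the database's actual value. A multiple of 400
# keeps the Gregorian leap-year pattern identical after shifting.
BC_OFFSET_YEARS = 4800


def to_stored(year: int, month: int, day: int, era: str = "AD") -> date:
    """Map a readable BC/AD date to an offset date that fits in years 1..9999."""
    if era not in ("BC", "AD") or year < 1:
        raise ValueError("expected a year >= 1 and an era of 'BC' or 'AD'")
    # There is no year zero: 1 BC is immediately followed by 1 AD.
    # Astronomical numbering closes the gap (1 BC -> 0, 2 BC -> -1, ...).
    astronomical = year if era == "AD" else 1 - year
    return date(astronomical + BC_OFFSET_YEARS, month, day)


def to_display(stored: date) -> str:
    """Turn a stored offset date back into a readable BC/AD string."""
    astronomical = stored.year - BC_OFFSET_YEARS
    if astronomical >= 1:
        year, era = astronomical, "AD"
    else:
        year, era = 1 - astronomical, "BC"
    return f"{year:04d}-{stored.month:02d}-{stored.day:02d} {era}"


def serial(stored: date) -> int:
    """Serial day counter that orders BC and AD dates on a single axis."""
    return stored.toordinal()


last_day_bc = to_stored(1, 12, 31, "BC")
first_day_ad = to_stored(1, 1, 1, "AD")
gap_days = serial(first_day_ad) - serial(last_day_bc)
```

Here `gap_days` is 1, because 31 December 1 BC and 1 January 1 AD land on adjacent stored dates. With this assumed offset the representable range runs from 4800 BC to 5199 AD; a different range would need a different offset.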
Various demographic lists such as gender, race, and religion
Incorporation dates of administrative divisions (such as States in the United States). This is helpful to fix some faulty analyses I’ve read. An example is the claim that east coast and southern states have more executions than other states. The claimants did not normalize the data for how many years the states have been incorporated, so they are not comparing apples to apples. Comparing one entity to another that was statistically invisible for over half of the same time range is a gross error, leading to completely wrong conclusions.
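The normalization argument above can be illustrated with a small sketch. The division names and numbers below are made-up placeholders, not figures from the database, and `executions_per_year` is a hypothetical helper. It only shows how dividing by years of incorporation can reverse a ranking based on raw totals.

```python
def executions_per_year(total: int, incorporated: int, end_year: int) -> float:
    """Average executions per year of incorporation, inclusive of both end years."""
    years = end_year - incorporated + 1
    if years <= 0:
        raise ValueError("end_year must not precede the incorporation year")
    return total / years


# Placeholder records for illustration only (not real data).
records = [
    {"division": "State A", "incorporated": 1800, "executions": 400},
    {"division": "State B", "incorporated": 1900, "executions": 250},
]
END_YEAR = 1999

rates = {
    r["division"]: executions_per_year(r["executions"], r["incorporated"], END_YEAR)
    for r in records
}
```

In this placeholder data, State A has the larger raw total (400 against 250), but over 200 years that is 2.0 executions per year, while State B's 250 over 100 years is 2.5. The raw comparison and the normalized comparison point in opposite directions.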
Inflation conversion factors
US GDP from 1790 to present day
Crime rates back to 1900, acquired through my special projections (since prior to the classifications of the 1960s, the only reliable crime statistic was homicide). As it turns out, there are ways to model other categories of crime using the homicide rate.
Executions back to 1608, synthesized from the Espy File and separate government sources. Name, crime, method, date, gender, race for each executed prisoner. Victim demographics for most criminals. Synthesis of records from multiple sources disambiguated and de-duplicated with data transformation code.
Criminal offenses, categories, and execution methods and categories
Criminal Justice government spending back to the 18th century
Lynching vs. Civil Authority executions back to the 19th century. Useful for a look at Honor Culture now and then.
Hate Crime statistics
National Security & Politics
Democide architecture derived from Professor Rummel’s excellent work at the University of Hawaii. Using his life’s crowning achievement, a design was produced to handle the complexity of his data and expand its usefulness. Improved with best point estimate math, and other innovations.
Ability to handle multiple state actors in a Democide event, according to their roles
Ability to record regimes across nations over time, based on the best match for a modern predecessor. Expandable for geographic areas to cover multiple modern predecessors.
Political ideology and authority demographics for regimes (based on Hans Slomp)
Ability to record cultural demographics of regimes over time
Integration of the Global Terrorism Database
Attack types and target types
A history of American conflicts – from full-scale wars to low-intensity conflicts (LICs). Casualties, dates, war type, missing soldiers, locations of the conflict, enemies killed, and victory outcomes.
An estimate of the global history of war casualties (excluding democide and terrorism) from 3000 BC to present day, per century (as noted by Professor Rummel)
Border change events from the 19th century until today, from war or hostile statecraft. Expandable to receive more historical border changes, and geographic data objects to precisely describe the exact land that was gained or lost, and the regime that won or lost the land.
These features represent data that has already been used in Excel to produce analytical results for the two Alvarism book manuscripts, but should be moved into the Alvarism Database in the future, when the project receives institutional financing.
Government Spending and Taxation Data
Culture Industrial Complex economic data (the entire analysis for the Alvarism Part 2 manuscript, dealing with all cultural industries). Includes Education (DoEd), Religion, Civics (nonprofits, government, political spending), Journalism, and Entertainment.
American Time Use Analysis data (complemented by other published academic research to cover age ranges that are outside of the scope of the ATUS).
Census data syntheses
BEA data syntheses
World Bank Development Indicator syntheses
Bureau of Labor Statistics (BLS) data syntheses
Collaboration, Grants, and Financing
If you have an interest in the Alvarism Database, any of its features, or techniques I used to implement it, please contact me here.