Excel Problems: Bulk Transformation Pains
Raw data is the raw material data analysts use to produce insights for decision makers. As with any raw material, raw data needs to be prepared before any analysis can be built on top of it.
Analysts who use Excel for business intelligence know all too well that there is no straightforward way of extracting insights. For instance, suppose that column A contains postal codes of North American cities and the task at hand is to populate column B with the names of the corresponding states.
This is a tedious and painful exercise in Excel, and it is likely to throw up several errors caused by issues ranging from postal codes being stored as text to an invisible dash after the code because of human input error. Getting around such issues requires advanced skills and a lot of battle experience in Excel.
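To make these failure modes concrete, here is a minimal Python sketch of the cleanup such a column would need. It assumes 5-digit US ZIP codes; the function name `normalize_postal_code` is hypothetical and not part of Excel or DataScout.

```python
import re

def normalize_postal_code(raw):
    """Return a 5-digit ZIP string, or None if the value cannot be repaired.

    Assumption: codes are US ZIP or ZIP+4; other postal formats are not handled.
    """
    # Values may arrive as integers (leading zeros lost) or as text with padding.
    text = str(raw).strip()
    # Turn dash look-alikes (en dash, non-breaking hyphen, minus sign, ...) into "-".
    text = re.sub(r"[\u2010-\u2015\u2212]", "-", text)
    # Keep only the part before a ZIP+4 suffix or a stray trailing dash.
    text = text.split("-")[0].strip()
    if not re.fullmatch(r"[0-9]{1,5}", text):
        return None
    # Restore leading zeros dropped when the code was stored as a number.
    return text.zfill(5)

print(normalize_postal_code(2134))      # code stored as a number
print(normalize_postal_code("90210-"))  # code with a trailing dash
```

Values that cannot be repaired come back as `None`, so they can be reviewed by hand instead of silently producing a wrong state.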
One of the biggest data transformation problems analysts face with Excel is joining data from different sources. The VLOOKUP and HLOOKUP functions are used for merging data, but the slightest mismatch in format or arrangement of the data will cause errors that need to be rectified.
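The sketch below mimics a lookup-style join in plain Python and shows how a small format mismatch leaves a row unmatched. The helper `left_join` and the sample data are hypothetical, chosen only for illustration.

```python
def left_join(rows, lookup, key, value_name):
    """Attach lookup values to each row and collect keys that found no match."""
    joined, unmatched = [], []
    for row in rows:
        # Trim padding so " 90210" still matches "90210".
        k = str(row[key]).strip()
        match = lookup.get(k)
        if match is None:
            unmatched.append(k)
        joined.append({**row, value_name: match})
    return joined, unmatched

# Hypothetical sample data: the last code has lost its leading zero.
zip_to_state = {"02134": "MA", "90210": "CA"}
incidents = [{"zip": "02134"}, {"zip": " 90210"}, {"zip": "2134"}]

joined, unmatched = left_join(incidents, zip_to_state, "zip", "state")
print(unmatched)
```

Reporting the unmatched keys separately makes the mismatches visible in one place, rather than leaving them scattered through the sheet as lookup errors.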
Data wrangling pains are further compounded if you run up against Excel’s row and column limits. Excel was designed for accountants, not analysts. It will drain your personal computer’s computing power, and running a macro can take several minutes. If you cram multiple functions into one formula, a coffee break is advisable while Excel takes its own sweet time to produce results. In an earlier post, we pointed out why Excel stops businesses from reaching their true potential.
Data visualization is another area in which Excel does not provide the best options. Common problems data analysts face in this regard are:
- Inflexible data visualizations
- Inadequate chart selections
- Limited data point rendering capability
Anything Excel cannot handle is a piece of cake for DataScout. So, if the overall aim of the above exercise was to work out crime rates in each state from the postal codes where the crimes were reported, DataScout will do it in a flash. DataScout can do this because it is built around a big data architecture that is not a slave to formatting requirements.
Moreover, DataScout will effortlessly let you dive deep into your data if you want to explore further:
- Time ranges that have the most density of crimes
- The crime category split (battery, manslaughter, robbery) for each state and city
- Average response times
- Average number of officers reporting at the crime scene within 10 minutes
If the data is there, DataScout will spill it all out without requiring you to write a single line of code. It’s a self-service BI tool that is as intuitive as a smartphone. Give it a try.