The advancement of any field of research depends on intelligence. What was learned yesterday, or even a few minutes ago, can strongly influence the next research decision. Yet when it comes to collecting “all that should be known” about a given subject, the information landscape is both largely unstructured and daunting in sheer volume.

Researchers need pertinent data in real time, in familiar formats that allow for fast recognition, quick consumption and deeper digging where desired. They need to stay on top of what is happening in their field and in their company, yet they don’t have the time to sift through mountains of data to find what they need.

In answer, the link between CIOs, data scientists and research teams needs to grow stronger, and greater focus needs to be given to the information environment surrounding research teams. To aid in this, sophisticated data aggregation/curation/distribution tools can be brought to bear, designed to deliver highly relevant information in a transparent fashion. By helping teams navigate the information flood, such internal changes breed greater information awareness, company-wide trust and collaborative thinking.

The Growing Volume of Data and its Lack of Structure

Many think of the internet as a well-organized and complete mass of linked information. In actuality (and surely no surprise to many reading this blog), it is more of a rogue landscape with deep and hidden layers. Depending on one’s route and method of access, widely different search results can be achieved. It is a fledgling system that requires care and sophistication in its approach.

Further, we’ve all heard about how fast data is being generated, following and in many cases exceeding Moore’s Law by most measurements. The International Data Corporation (IDC) estimates that:

“The volume of digital data will grow 40% to 50% per year and by 2020, the number will have reached 40,000 Exabytes (EB), or 40 Zettabytes (ZB). At this rate, the world’s information is doubling every two years and by 2020 the world will generate 50 times the amount of information it does today and 75 times the number of information containers.”

That’s a lot of 1s and 0s, and the vast preponderance of it is unstructured, and will remain unstructured until gathered, curated and processed.
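As a quick sanity check, a growth rate of 40% to 50% per year really does imply a roughly two-year doubling time. Here is a minimal back-of-the-envelope calculation (plain Python, tied to no particular vendor or dataset):

```python
# Sanity check on the IDC figure quoted above: at 40-50% annual growth,
# how long does the volume of digital data take to double?
import math

for annual_growth in (0.40, 0.45, 0.50):
    doubling_years = math.log(2) / math.log(1 + annual_growth)
    print(f"{annual_growth:.0%} annual growth: doubles roughly every {doubling_years:.1f} years")

# Roughly 2.1 years at 40%, 1.9 years at 45% and 1.7 years at 50%,
# consistent with the "doubling every two years" claim.
```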

For quick review, Wikipedia defines unstructured data as follows:

“Unstructured Data (or unstructured information) refers to information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well.”
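To make the distinction concrete, here is a small, purely illustrative example (the record and the note below are made up): the same facts captured once as a structured record and once as the kind of free-form text a curation tool actually has to contend with.

```python
# The same information, expressed two ways.

# Structured: a pre-defined data model with named fields and known types.
structured_record = {
    "company": "Acme Research",          # illustrative name
    "event": "clinical trial readout",
    "date": "2015-03-12",
    "patients_enrolled": 240,
}

# Unstructured: text-heavy prose. The dates, numbers and facts are all in
# there, but nothing tells a machine where to find them.
unstructured_note = (
    "Acme Research announced on March 12th that its latest clinical trial, "
    "which enrolled roughly 240 patients, met its primary endpoint."
)
```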

[Figure: growth of unstructured data. The exponential growth of unstructured data presents both issues and opportunities in research; having the right tools to manage the flood and deliver real-time, relevant content will be an increasing difference-maker. Image courtesy of datasciencecentral.com]

Putting this massive and largely unstructured body of data further into perspective, here’s a quote from Jeff Stone and Charles Poladian, from a recent article they co-wrote for the International Business Times:

“When most users log on to the Internet, they visit their bookmarked pages or use Google to search for sites that can provide the information they’re looking for. That Internet, used by billions around the world every day, is sometimes known as the Surface Web, or the Clearnet, as coined by Tor and other anonymous online users. The Deep Web, simply put, is everything else. It’s made up of tens of billions of sites that are hidden within a universe of code — various estimates have put the Deep Web at anywhere from five to 500,000 times the size of the Surface Web.”

Fortunately, when looking for information, there are excellent data aggregation/curation/delivery tools that reach throughout this system of digitally connected machines. These tools not only gather information from within an organization; they also reach deeper into the web, accessing public, private and subscription-based content that traditional search engines simply never index. This comprehensive, system-wide approach yields far more unstructured data, which must then be assessed for relevance and disseminated in real time.
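To illustrate what that system-wide reach looks like in practice, here is a minimal aggregation sketch; the source names, URLs and credentials are placeholders, not any particular vendor’s API:

```python
# A minimal aggregation sketch: pull raw items from several kinds of
# sources (internal, public and subscription-based) into a single pool
# for downstream curation. Names, URLs and credentials are placeholders.

import json
import urllib.request

def fetch_internal(path="internal_reports.json"):
    """Documents already inside the organization, e.g. an intranet export."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def fetch_public(url="https://example.org/research-feed.json"):
    """Public web content reachable without credentials."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))

def fetch_subscription(url="https://api.example-provider.com/v1/articles",
                       api_key="YOUR_KEY"):
    """Subscription content that traditional search engines never index."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))

def aggregate():
    """Collect everything into one unstructured pool, tagged by source."""
    pool = []
    sources = [("internal", fetch_internal),
               ("public", fetch_public),
               ("subscription", fetch_subscription)]
    for name, fetch in sources:
        try:
            for item in fetch():
                pool.append({"source": name, **item})
        except (OSError, ValueError) as err:
            print(f"skipping {name}: {err}")  # a real tool would log and retry
    return pool
```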

Information Relevance Through Advanced Boolean Logic

Traditional search engines use a ranking system to display search results. Driven by paid placement such as Google AdWords, or by sheer content volume and site speed, the results of these searches often lack relevance. In stark contrast, the more sophisticated data aggregation/curation/delivery tools available today, which go well beyond the surface web, are driven by advanced Boolean logic algorithms.

Known as the “mathematics of logic,” Boolean expressions allow condition upon condition to be layered into a given search query, making very targeted results possible. Indeed, the better the logic at work within a given organization’s data aggregation tool, the more relevant that company’s received intelligence will be. Such proprietary algorithms are well-protected secrets at any data aggregation/curation firm and are one of the key difference-makers in that industry.
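As a simple illustration of the idea (not any vendor’s proprietary algorithm), the sketch below layers nested AND/OR/NOT conditions into a single query and applies it to a handful of made-up documents:

```python
# Layered Boolean filtering: each condition narrows the result set, and
# conditions can be nested with AND / OR / NOT to target a query precisely.
# The sample documents and keywords are illustrative only.

def matches(text, query):
    """Recursively evaluate a nested Boolean query against a document's text."""
    op, *args = query
    text = text.lower()
    if op == "TERM":
        return args[0].lower() in text
    if op == "AND":
        return all(matches(text, q) for q in args)
    if op == "OR":
        return any(matches(text, q) for q in args)
    if op == "NOT":
        return not matches(text, args[0])
    raise ValueError(f"unknown operator: {op}")

documents = [
    "Phase II oncology trial shows promising immunotherapy results",
    "Competitor announces new marketing campaign for consumer devices",
    "Review of CRISPR off-target effects in oncology research",
]

# (oncology AND (immunotherapy OR CRISPR)) AND NOT marketing
query = ("AND",
         ("TERM", "oncology"),
         ("OR", ("TERM", "immunotherapy"), ("TERM", "CRISPR")),
         ("NOT", ("TERM", "marketing")))

relevant = [doc for doc in documents if matches(doc, query)]
print(relevant)  # the two oncology items; the marketing piece is filtered out
```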

[Figure: basic identities of Boolean algebra. The use of advanced Boolean logic to filter out targeted, relevant intelligence is a key difference-maker in keeping research teams aware of developments in their field. Image courtesy of wikipedia.com]

With the vast amount of data that can be queried, both internally and at multiple (and deeper) layers of the web, a highly sophisticated aggregation/curation/delivery tool is exactly what researchers need on board; they simply don’t have time to waste filtering out irrelevant information. The job of determining relevance needs to be done before they view their intelligence feed. Achieving 100% efficiency in this realm is, of course, next to impossible, yet it remains the aim.

Information Delivery: Targeted and Transparent

As information flows into an organization it must be curated and distributed into a secure, collective environment that fosters information sharing. Through such information transparency (sometimes loosely called “groupthink”), the collective intelligence and trust across an organization can flourish well beyond the sum of its individual parts. Why? Because people are not (generally) one-dimensional and can offer ideas even far outside their main area of focus. Further, people are curious and they don’t like secrets (unless they’re in on them). Open information, and a lack of proverbial walls, builds trust.

In a recent article for Inc.com, writer James Kerr states:

“Trust provides strength against adversity for a business. People just seem to pull together when they trust one another. Problems are addressed head-on–with no excuses made or expected. Work becomes play in high trust companies. It is fun to see what can be accomplished when everyone works together to achieve a common goal. Trust-building begins with honest and transparent communication. Business leaders need to integrate transparent communications into their businesses.”

By breaking down the old, highly restrictive “need-to-know” information barrier, employees who have access to wider information relevant to the company will be happier, more creative, energized, secure and broader in their thinking. Indeed, some of the greatest breakthroughs in history have come from laymen bringing a fresh, outside-the-box approach. At the very least, a greater sense of community will be built within the organization.

Proper information aggregation and delivery tools can achieve such collaborative transparency while still directly delivering only what is needed to given individuals or groups. This method of breaking down walls and getting everyone on board to think about the issues and challenges facing the company can also be termed “information cross-training.” Who knows? Bill in nano engineering might just solve the marketing issue facing Susie through a unique and thoughtful approach. Isolation is out; groupthink is in.
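A minimal sketch of that targeted-but-transparent delivery might look like the following; the team names, topic tags and items are hypothetical, and a real tool would of course layer in security and personalization:

```python
# Targeted delivery with transparency: everything lands in a shared,
# company-wide stream, but each group is directly notified only about
# the items matching its interests. Teams, tags and items are illustrative.

from collections import defaultdict

team_interests = {
    "nano_engineering": {"materials", "fabrication"},
    "marketing": {"competitors", "branding"},
    "clinical_research": {"oncology", "trials"},
}

def distribute(curated_items):
    shared_stream = []                   # visible to the whole company
    direct_feeds = defaultdict(list)     # pushed to specific teams
    for item in curated_items:
        shared_stream.append(item)
        for team, interests in team_interests.items():
            if interests & set(item["tags"]):
                direct_feeds[team].append(item)
    return shared_stream, dict(direct_feeds)

items = [
    {"title": "New oncology trial results", "tags": ["oncology", "trials"]},
    {"title": "Competitor rebranding effort", "tags": ["competitors", "branding"]},
]
stream, feeds = distribute(items)
print(feeds)  # each team sees only its matches; the stream holds everything
```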

Finally, a Critical Measure of Effectiveness

Even with the best data aggregation/curation/distribution tools brought to bear, such tools must also contain a way to measure both the effectiveness of their delivery and the ultimate consumption of the information. What devices are being used, and who is using them? How timely and relevant is the information? Just how much awareness is being achieved?

As information curation, delivery and consumption are studied in detail, opportunities for more-targeted deliveries, specific device focus and greater personalization can be identified by the data aggregation tool. Here again, time is the X factor: if tweaks to the curation and delivery side of an intelligence stream can both increase relevance and save time, this part of the equation takes on tremendous value. So begins a continuously running loop, in which relevant information is delivered and consumed, and the process is analyzed and adjusted for maximum effectiveness in the next round.
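As a rough sketch of what that measurement loop could track (the event fields and example numbers are assumptions, not any particular product’s schema), the snippet below rolls raw read events up into the signals mentioned above: device mix, timeliness and how much of the delivered feed is actually consumed.

```python
# Measuring delivery and consumption: aggregate raw read events into
# simple effectiveness signals. Field names and values are illustrative.

from collections import Counter

def effectiveness_report(delivered_count, read_events):
    devices = Counter(e["device"] for e in read_events)
    lag_hours = sorted((e["read_at"] - e["delivered_at"]) / 3600 for e in read_events)
    return {
        "consumption_rate": len(read_events) / delivered_count if delivered_count else 0.0,
        "device_mix": dict(devices),
        "median_read_lag_hours": lag_hours[len(lag_hours) // 2] if lag_hours else None,
    }

# Example: 40 items delivered this week, 3 sample read events (epoch seconds).
events = [
    {"device": "mobile",  "delivered_at": 0,    "read_at": 1800},
    {"device": "desktop", "delivered_at": 0,    "read_at": 7200},
    {"device": "mobile",  "delivered_at": 3600, "read_at": 90000},
]
print(effectiveness_report(40, events))
```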

[Figure: information awareness flow]

Data aggregation/curation/delivery tools that contain all of the facets outlined here will serve as growing difference-makers for research-based companies seeking to maintain a leadership position in their industry. With such a tool properly employed, relevant information awareness can spread with increasing, analytically driven efficiency, allowing company-wide trust and intelligence to flourish.