<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>answersy.com Blog &#187; Data</title>
	<atom:link href="http://answersy.com/zchen/index.php/category/it-related/data/feed/" rel="self" type="application/rss+xml" />
	<link>http://answersy.com/zchen</link>
	<description>Got questions? Got answers!</description>
	<lastBuildDate>Thu, 20 Oct 2011 04:28:21 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>How to release the value of data?</title>
		<link>http://answersy.com/zchen/2008/11/25/how-to-release-the-value-of-data/</link>
		<comments>http://answersy.com/zchen/2008/11/25/how-to-release-the-value-of-data/#comments</comments>
		<pubDate>Tue, 25 Nov 2008 22:48:45 +0000</pubDate>
		<dc:creator>zchen</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[E-Biz]]></category>
		<category><![CDATA[IT Related]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[Random Thoughts]]></category>

		<guid isPermaLink="false">http://answersy.com/zchen/2008/11/25/how-to-release-the-value-of-data/</guid>
		<description><![CDATA[Cloud computing, meaning scalable Internet-based services, is currently a hot topic. Today I am going to look at "cloud computing" from a different angle: data. Traditional businesses invest heavily in large data warehouses to capture and analyze transaction data. However, the feedback loop is an open one. Executives make decisions based on the insights from [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Cloud computing</strong>, meaning scalable Internet-based services, is currently a hot topic. Today I am going to look at "cloud computing" from a different angle: data.</p>
<p>Traditional businesses invest heavily in large data warehouses to capture and analyze transaction data. However, the feedback loop is an open one: executives make decisions based on insights from the data, while execution runs in a separate pipeline.</p>
<p>For Internet-based services, success is usually measured by traffic volume and quality. Traffic <strong><em>volume</em></strong> implies scalability, while <strong><em>quality</em></strong> implies relevancy, namely whether the delivered content is relevant to viewers' interests. To achieve high relevancy, web sites collect a lot of user behavioral data. This raw data needs to be aggregated into knowledge that can be used in delivering content.</p>
<p>In short, the key competence of an Internet service is to close the feedback loop and make data part of the service delivery system itself. Meanwhile, everything in the pipeline should be able to scale up to meet large numbers of requests. That is why cloud computing is so vital to an Internet company: data <strong>storage</strong>, <strong>analysis</strong> and <strong>serving</strong> should all move to cloud-based infrastructure so that the true value of data can be fully unleashed.<br />
<img alt="Data Layers" id="image214" src="http://answersy.com/zchen/wp-content/uploads/2008/11/taobao-data-layer2.jpg" /></p>
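<p>As a rough illustration of what closing the feedback loop could look like, here is a minimal Python sketch. The <code>FeedbackLoop</code> class, its method names and the sample catalog are all hypothetical; a plain list and in-memory counters stand in for the cloud storage, analysis and serving layers.</p>

```python
from collections import Counter, defaultdict


class FeedbackLoop:
    """Toy stand-in for the storage, analysis and serving layers (assumed design)."""

    def __init__(self, catalog):
        # catalog maps an item name to its category (illustrative data only).
        self.catalog = dict(catalog)
        self.raw_events = []                  # storage: raw behavioral data
        self.interest = defaultdict(Counter)  # analysis: per-user category counts

    def record_click(self, user, item):
        """Store the raw event, then fold it into the aggregated knowledge."""
        self.raw_events.append((user, item))
        category = self.catalog[item]
        self.interest[user][category] += 1

    def serve(self, user, limit=3):
        """Rank items by how often the user clicked their category (ties by name)."""
        counts = self.interest[user]
        ranked = sorted(
            self.catalog,
            key=lambda item: (-counts[self.catalog[item]], item),
        )
        return ranked[:limit]


loop = FeedbackLoop({
    "camera": "electronics",
    "phone": "electronics",
    "lipstick": "beauty",
    "novel": "books",
})
loop.record_click("alice", "phone")
loop.record_click("alice", "camera")
loop.record_click("alice", "novel")
print(loop.serve("alice"))  # ['camera', 'phone', 'novel']
```

<p>Every click recorded here changes what the next call to <code>serve</code> returns, which is the sense in which data becomes part of the delivery system rather than a report read afterwards.</p>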
]]></content:encoded>
			<wfw:commentRss>http://answersy.com/zchen/2008/11/25/how-to-release-the-value-of-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to Make BI Successful?</title>
		<link>http://answersy.com/zchen/2008/07/29/how-to-make-bi-successful/</link>
		<comments>http://answersy.com/zchen/2008/07/29/how-to-make-bi-successful/#comments</comments>
		<pubDate>Tue, 29 Jul 2008 20:26:57 +0000</pubDate>
		<dc:creator>zchen</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[IT Related]]></category>
		<category><![CDATA[Random Thoughts]]></category>

		<guid isPermaLink="false">http://answersy.com/zchen/2008/07/29/how-to-make-bi-successful/</guid>
		<description><![CDATA[10 Keys to a Successful Business Intelligence Strategy 1. Choose a C-level sponsor (who’s not the CIO). Business intelligence implementations should absolutely not be sponsored by anyone in IT. Instead, BI should be sponsored by an executive who has bottom-line responsibility; has a broad picture of the enterprise objectives, strategy and goals; and knows how [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cio.com/article/148000/_Keys_to_a_Successful_Business_Intelligence_Strategy">10 Keys to a Successful Business Intelligence Strategy</a><br />
<strong>1. Choose a C-level sponsor (who’s <em>not</em> the CIO).</strong> Business intelligence implementations should absolutely not be sponsored by anyone in IT. Instead, BI should be sponsored by an executive who has bottom-line responsibility; has a broad picture of the enterprise objectives, strategy and goals; and knows how to translate the company mission into key performance indicators that will support that mission. This executive is often the CFO. This sponsor should govern the implementation with a documented business case and be responsible for changes in scope.</p>
<p><strong>2. Create common definitions.</strong> Without common definitions, a BI implementation cannot succeed, and lack of agreement is a widespread problem in companies today. For example, finance and sales may define “gross margin” differently, which means that numbers will not match and, in effect, negates the value of automation. To combat this problem, first gather subject matter expertise across lines of business from front-, middle- and back-office staff. At this stage, IT's participation should be limited to running the project management office and taking ownership of compliance and business standards and policies. Second, <em>start small and choose only 10 to 20 key performance indicators</em>, and create standards and governance with them in mind.</p>
<p><strong>3. Assess the current situation.</strong> You should analyze the current business intelligence stack and processes and organizational structures surrounding current BI implementations. Both IT and the business should be involved.</p>
<p><strong>4. Create a plan for data storage.</strong> Many organizations begin with an isolated data mart, since it’s quick and cheap, but this tactic means more silos will have to be created as new data storage needs arise, and their number can grow out of control within a few years. Something else to consider is whether to build and maintain a physical data warehouse or go with virtual, <em>so-called “semantic” layers to link operational systems</em>. Traditional data warehousing means duplicating data, which makes bringing in operational systems in real time next to impossible. You can save space with an abstract definition layer, but this is difficult to design, as is any metadata repository. Before even considering which vendors to choose, you must resolve this issue.</p>
<p><strong>5. Understand what users need.</strong> The three broad classes of business intelligence users are <strong>strategic</strong>, <strong>tactical</strong> and <strong>operational</strong>. Strategic users make few decisions, but each one can have a profound effect; for example, should we close operations in Europe and open them in China? Tactical users make many decisions a week, use both aggregate and detail-level information, and likely need updated information daily. Operational users are the front-line employees, such as call center staff. They need data within their own set of applications to execute enormous numbers of transactions. Understanding who will use BI and for what purposes shows the type of information needed and how often it is needed, and helps guide BI decision making.</p>
<p><strong>6. Decide whether to buy or build the analytical data model.</strong> One size does not fit all. In general you may benefit from an out-of-the-box, industry-specific data model if you have a more homogeneous IT environment—such as one ERP, one CRM system. Do watch for extensibility and hierarchy flexibility. More complex enterprises may benefit from customization, although you may still want to consider beginning with an industry-standard model as a template or a set of guides (such as typical facts, dimensions and so on).</p>
<p><strong>7. Consider all business intelligence components.</strong> Components that affect the success of business intelligence implementations include: metadata, data integration, data quality, data modeling, analytics, centralized metrics management, presentations (reports and dashboards), portals, collaboration, knowledge management and master data management. Be sure to define the architecture for all layers of the business intelligence stack; even though they may not be part of the BI strategy itself, they will affect the success of the implementation.</p>
<p><strong>8. Choose a systems integrator.</strong> Business intelligence implementations require guidance from a partner who has deep experience. Be prepared to spend $5 to $7 on services for every $1 on software. A caution: do not outsource the fine-tuning of business intelligence, because the process requires a high degree of collaboration among end users, analysts and developers.</p>
<p><strong>9. Think “actionable” and “baby steps.”</strong> Choose an end user, business analyst and developer to create a first proof of concept within a few days. Choose a few key performance indicators and build a few reports, then add new releases every few weeks.</p>
<p><strong>10. Choose low-hanging fruit to start.</strong> Choose high-value, simple components to begin. For example, a sales analytics data mart may present high-value targets that also have plenty of existing models and best practices.</p>
]]></content:encoded>
			<wfw:commentRss>http://answersy.com/zchen/2008/07/29/how-to-make-bi-successful/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Warehouse (1)</title>
		<link>http://answersy.com/zchen/2008/04/09/data-warehouse-1/</link>
		<comments>http://answersy.com/zchen/2008/04/09/data-warehouse-1/#comments</comments>
		<pubDate>Wed, 09 Apr 2008 21:45:18 +0000</pubDate>
		<dc:creator>zchen</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[Design Doc]]></category>
		<category><![CDATA[IT Related]]></category>
		<category><![CDATA[Random Thoughts]]></category>

		<guid isPermaLink="false">http://answersy.com/zchen/2008/04/09/data-warehouse-1/</guid>
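		<description><![CDATA[Infrastructure architecture: The infrastructure must be easy to use and sufficient for the load. The main things to consider are total data volume, inbound and outbound traffic, and growth rate. In other words: how much data will the warehouse ultimately hold, what input/output load will it bear, and how will the total volume and the I/O load change over time? A common choice is off-the-shelf Oracle RAC configured for OLAP; another is to combine a free DBMS with middleware; one can even build the entire storage system in-house, for example with Hadoop. Clearly, Oracle is faster to set up but relatively costly; the other options require a substantial investment of staff time, but they let you master the underlying technology and offer more flexibility. Raw data cleansing: mainly filtering noise, tagging, and filling in missing values. Data loading: distribution, storage, and indexing. Data aggregation: computing statistics according to business or analytical needs. Data warehouse design. Identify the business process: you must have a deep understanding of the specific business activity itself. Identify the elements and dimensions of the business process: understand the people, things, events, and activities in the system and the logical relationships among them. Decide which data attributes describe each element, event, and activity. A good data warehouse should have a complete set of base-level dimension designs, and the applications built on top should reuse these basic dimension definitions as much as possible. Determine the grain of the business process: decide at what macro or micro level the business process is described. Determine the quantitative facts: for example, revenue is a measure in retail.]]></description>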
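			<content:encoded><![CDATA[<p><strong>Infrastructure architecture</strong></p>
<p>The infrastructure must be easy to use and sufficient for the load. The main things to consider are total <strong>data volume</strong>, <strong>inbound and outbound traffic</strong>, and <strong>growth rate</strong>. In other words: how much data will the warehouse ultimately hold, what input/output load will it bear, and how will the total volume and the I/O load change over time? A common choice is off-the-shelf Oracle RAC configured for OLAP; another is to combine a free DBMS with middleware; one can even build the entire storage system in-house, for example with Hadoop. Clearly, Oracle is faster to set up but relatively costly; the other options require a substantial investment of staff time, but they let you master the underlying technology and offer more flexibility.</p>
<ul>
<li><strong>Raw data cleansing</strong>: mainly filtering noise, tagging, and filling in missing values.</li>
<li><strong>Data loading</strong>: distribution, storage, and indexing.</li>
<li><strong>Data aggregation</strong>: computing statistics according to business or analytical needs.</li>
</ul>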
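<p><strong>Data warehouse design</strong></p>
<ul>
<li><strong>Identify the business process.</strong> You must have a deep understanding of the specific business activity itself.</li>
<li><strong>Identify the elements and dimensions of the business process.</strong> Understand the people, things, events, and activities in the system and the logical relationships among them. Decide which data attributes describe each element, event, and activity. A good data warehouse should have a complete set of <strong>base-level dimension designs</strong>, and the applications built on top should <strong>reuse</strong> these basic dimension definitions <strong>as much as possible</strong>.</li>
<li><strong>Determine the grain of the business process.</strong> Decide at what macro or micro level the business process is described.</li>
<li><strong>Determine the quantitative facts.</strong> For example, revenue is a measure in retail.</li>
</ul>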
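<p>The design steps above (process, dimensions, grain, facts) can be sketched in a few lines of Python. This is only an illustration under assumed names: <code>DateDim</code>, <code>ProductDim</code>, <code>SalesFact</code> and <code>revenue_by</code> are hypothetical, the business process is taken to be retail sales, and the grain is assumed to be one row per product per day.</p>

```python
from collections import defaultdict
from dataclasses import dataclass


# Shared base-level dimensions (hypothetical): every application reuses these.
@dataclass(frozen=True)
class DateDim:
    date_key: str  # e.g. "2008-04-09"
    month: str     # e.g. "2008-04"


@dataclass(frozen=True)
class ProductDim:
    product_key: str
    category: str


# Fact row at the assumed grain: one row per product per day.
@dataclass(frozen=True)
class SalesFact:
    date: DateDim
    product: ProductDim
    revenue: float  # the quantitative fact (measure)


def revenue_by(facts, key):
    """Roll the revenue measure up along any dimension attribute."""
    totals = defaultdict(float)
    for fact in facts:
        totals[key(fact)] += fact.revenue
    return dict(totals)


apr9 = DateDim("2008-04-09", "2008-04")
apr10 = DateDim("2008-04-10", "2008-04")
phone = ProductDim("p1", "electronics")
lipstick = ProductDim("p2", "beauty")
facts = [
    SalesFact(apr9, phone, 300.0),
    SalesFact(apr10, phone, 200.0),
    SalesFact(apr9, lipstick, 20.0),
]
print(revenue_by(facts, lambda f: f.product.category))  # {'electronics': 500.0, 'beauty': 20.0}
print(revenue_by(facts, lambda f: f.date.month))        # {'2008-04': 520.0}
```

<p>Because both roll-ups go through the same two dimension definitions, a new report only needs a new <code>key</code> function, which is the kind of reuse the dimension step asks for.</p>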
]]></content:encoded>
			<wfw:commentRss>http://answersy.com/zchen/2008/04/09/data-warehouse-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Warehouse &amp; Decision Support System</title>
		<link>http://answersy.com/zchen/2008/04/09/data-warehouse-decision-support-system/</link>
		<comments>http://answersy.com/zchen/2008/04/09/data-warehouse-decision-support-system/#comments</comments>
		<pubDate>Wed, 09 Apr 2008 19:09:41 +0000</pubDate>
		<dc:creator>zchen</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[IT Related]]></category>

		<guid isPermaLink="false">http://answersy.com/zchen/2008/04/09/data-warehouse-decision-support-system/</guid>
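		<description><![CDATA[Data is receiving more and more attention. Once processed, data becomes "information", and information that is put to effective use becomes a competitive advantage and a barrier to entry in business. In the business models of the Internet era, this is less and less of a secret. The data workflow always comes down to the following steps: collection, cleansing, storage, querying, analysis, understanding, prediction, decision, feedback. Collection looks straightforward: when you sell something, record the time and amount of each transaction. In fact, if you proactively collect richer data sources based on an understanding of the whole business process, the resulting "information" will be more valuable. Cleansing and storage provide the technical support for querying and analysis. A good technical platform should be inconspicuous in the overall workflow, yet sufficient and easy to use. Of course this infrastructure has a cost, and technical investment often has to be balanced against financial investment. Querying is a means, and it is also the starting point for looking at a problem. When a person knows exactly what to query, the data starts to come alive. A query has to fix the angle, dimensions, and granularity from which the data is viewed, and compute the statistics of the data at each granularity. The life cycle of data is short: people usually care only about the recent past and the near future. If massive amounts of data are collected but cannot be processed and analyzed in time, they just sit there quietly, of no value beyond occupying storage. Analysis builds models on top of queries and the intermediate data that queries produce, trying to explain the past and predict the future. History needs to be "summarized". Summarizing means computing sums, averages, maxima, and minima. For example, if we want to model a specific customer, their purchase history matters, but recording every one of their transactions gives you data that is complete yet of little use! You can sum counts and amounts by product category to determine which categories they care about; you can compute an average weighted by recency, which shows their latest changes in finer detail. In the end, what you need is a "statistical summary", which can be a set of numbers or a formula. Building a model on top of the analysis lets you predict future behavior. For example, if your summary shows that a customer cares most about consumer electronics, recommending an iPhone to them will be far more effective than recommending lipstick. The model then automatically builds this logic into the service you provide. If you suddenly notice users buying a certain category of goods in large quantities, your whole supply chain should be notified so it can cope with the new situation. If you see a trend but the big wave has not yet arrived, should you jump into the water and help push the wave, or stand on the shore and watch quietly? This is a choice a business decision maker must make. Whether it is a data warehouse or a DSS, its highest value lies in giving decision makers evidence to back their instincts. Finally, use data again to verify the effect of the decision or the model, and the loop is complete.]]></description>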
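			<content:encoded><![CDATA[<p><strong>Data</strong> is receiving more and more attention. Once processed, data becomes "information", and information that is put to effective use becomes a competitive advantage and a barrier to entry in business. In the business models of the Internet era, this is less and less of a secret.</p>
<p>The data workflow always comes down to the following steps: collection, cleansing, storage, querying, analysis, understanding, prediction, decision, feedback.</p>
<p><strong>Collection</strong> looks straightforward: when you sell something, record the time and amount of each transaction. In fact, if you proactively collect richer data sources based on an understanding of the whole business process, the resulting "information" will be more valuable.</p>
<p><strong>Cleansing</strong> and <strong>storage</strong> provide the technical support for querying and analysis. A good technical platform should be inconspicuous in the overall workflow, yet sufficient and easy to use. Of course this infrastructure has a cost, and technical investment often has to be balanced against financial investment.</p>
<p><strong>Querying</strong> is a means, and it is also the starting point for looking at a problem. When a person knows exactly what to query, the data starts to come alive. A query has to fix the angle, dimensions, and granularity from which the data is viewed, and compute the statistics of the data at each granularity.</p>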
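<p>To make "statistics at each granularity" concrete, here is a small Python sketch. The <code>summarize</code> function and the sample transactions are hypothetical, and dates are assumed to be ISO strings so that a prefix of the string identifies the day or the month.</p>

```python
from collections import defaultdict


def summarize(transactions, grain):
    """Group (date, amount) pairs at the given grain and compute basic statistics.

    grain is "day" or "month"; dates are ISO strings such as "2008-04-09".
    """
    width = {"day": 10, "month": 7}[grain]  # prefix length of the ISO date
    buckets = defaultdict(list)
    for date, amount in transactions:
        buckets[date[:width]].append(amount)
    return {
        period: {
            "count": len(amounts),
            "total": sum(amounts),
            "mean": sum(amounts) / len(amounts),
        }
        for period, amounts in sorted(buckets.items())
    }


transactions = [
    ("2008-04-08", 10.0),
    ("2008-04-09", 20.0),
    ("2008-04-09", 30.0),
    ("2008-05-01", 40.0),
]
print(summarize(transactions, "month")["2008-04"])  # {'count': 3, 'total': 60.0, 'mean': 20.0}
print(summarize(transactions, "day")["2008-04-09"])  # {'count': 2, 'total': 50.0, 'mean': 25.0}
```

<p>The same raw rows answer different questions depending on the grain chosen, which is why the grain has to be decided as part of the query rather than after it.</p>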
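<p>The life cycle of data is short: people usually care only about the recent past and the near future. If massive amounts of data are collected but cannot be processed and analyzed in time, they just sit there quietly, of no value beyond occupying storage.</p>
<p><strong>Analysis</strong> builds models on top of queries and the intermediate data that queries produce, trying to explain the past and predict the future.</p>
<p>History needs to be "summarized". Summarizing means computing sums, averages, maxima, and minima. For example, if we want to model a specific customer, their purchase history matters, but recording every one of their transactions gives you data that is complete yet of little use! You can sum counts and amounts by product category to determine which categories they care about; you can compute an average weighted by recency, which shows their latest changes in finer detail. In the end, what you need is a "statistical summary", which can be a set of numbers or a formula.</p>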
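<p>The two summaries mentioned above, per-category totals and a recency-weighted average, might look like the following Python sketch. The function names and sample purchases are hypothetical, and the weighting scheme is an assumption of this sketch: each purchase is weighted by exponential decay with a 30-day half-life, which is only one of many ways to weight by recency.</p>

```python
from collections import defaultdict


def category_totals(purchases):
    """Sum purchase count and amount per category.

    purchases is a list of (days_ago, category, amount) tuples.
    """
    totals = defaultdict(lambda: {"count": 0, "amount": 0.0})
    for _, category, amount in purchases:
        totals[category]["count"] += 1
        totals[category]["amount"] += amount
    return dict(totals)


def recency_weighted_mean(purchases, half_life_days=30.0):
    """Average amount, weighting each purchase by 0.5 ** (age / half_life).

    The exponential decay and the 30-day default are assumptions of this sketch.
    """
    weights = [0.5 ** (days_ago / half_life_days) for days_ago, _, _ in purchases]
    weighted = sum(w * amount for w, (_, _, amount) in zip(weights, purchases))
    return weighted / sum(weights)


purchases = [
    (0, "electronics", 100.0),   # bought today
    (30, "electronics", 200.0),  # bought 30 days ago
    (30, "beauty", 10.0),
]
print(category_totals(purchases)["electronics"])  # {'count': 2, 'amount': 300.0}
print(recency_weighted_mean(purchases))           # 102.5
```

<p>The plain average of these three amounts is about 103.3; the weighted figure of 102.5 leans toward the most recent purchase, which is the "latest change" the summary is meant to expose.</p>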
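<p>Building a model on top of the analysis lets you <strong>predict</strong> future behavior. For example, if your summary shows that a customer cares most about consumer electronics, recommending an iPhone to them will be far more effective than recommending lipstick. The model then automatically builds this logic into the service you provide.</p>
<p>If you suddenly notice users buying a certain category of goods in large quantities, your whole supply chain should be notified so it can cope with the new situation.</p>
<p>If you see a trend but the big wave has not yet arrived, should you jump into the water and help push the wave, or stand on the shore and watch quietly? This is a choice a business <strong>decision</strong> maker must make. Whether it is a data warehouse or a DSS, its highest value lies in giving decision makers evidence to back their instincts.</p>
<p>Finally, use data again to verify the effect of the decision or the model, and the loop is complete.</p>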
]]></content:encoded>
			<wfw:commentRss>http://answersy.com/zchen/2008/04/09/data-warehouse-decision-support-system/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>


