Comments on HDFS: Hadoop Research Topics
Dhruba Borthakur
Last updated 2024-01-27

Anonymous (2014-06-18 11:38):
Ha ha... Every software engineer wants to be on Hadoop. When talking to these dumb technical managers, even they keep saying the future will be cloud, Hadoop, blah, blah, blah. This seriously looks like the hype we saw around EJB in 2001-2002. If a business's data is completely structured, why does it have to move to Hadoop? I completely understand the potential of Hadoop and NoSQL databases, but hardly 5% of the industry has the kind of problem they address. A simple client-server request-response model is what is used everywhere.

Anonymous (2014-05-14 23:41):
Hi @Dhruba, I want some ideas for development projects. Can you please share them? Thanks in advance.

Anonymous (2014-05-02 08:07):
If you find something, please tell me.

Gaurav Ashara (2014-01-22 09:40):
This is really useful guidance for any researcher or Hadoop developer. Thank you.
I want to work on Hadoop security for my M.Phil. research thesis. I have read many papers and articles, and a lot of work has already been done in this area. Can you please suggest a direction within it? I have only 1.5 months left for research work.
Can you please help me?

Anonymous (2013-07-07 06:36):
Great post. For beginners, do follow
http://thehadooptutorial.blogspot.com

Oodles Technologies (2013-03-31 23:13):
Hello, your blog is full of complete Hadoop Technology (http://www.oodlestechnologies.com/) information. It is an excellent blog; I like it.

Anonymous (2013-03-27 12:39):
I want to know: can HDFS tolerate Byzantine faults?

Madhavi (2013-01-15 19:03):
to attend the Hadoop Summit...

OVJ (2013-01-11 20:52):
Hi,
I worked on dynamic replication in our semester project and wanted to submit it as a patch. I replied on the original jira-782 but didn't get any reply. May I know the best way to communicate with Hadoop developers?
Thank you,
Omkar

Raja Thiruvathuru (2013-01-08 00:38):
Wealth of information. Ton of ideas... nice blog.

Anonymous (2012-10-20 02:21):
This is an excellent blog.

Walchand M.Tech I 2011 (2012-09-11 04:11):
Hi, Dhruba
I am currently working on a Hadoop project for my M.Tech studies, inspired by your blog. My current project is "Accelerate Hadoop performance using Distributed Cache".
I am studying the Hadoop code but cannot find the exact entry point where I should add my caching code so that intermediate results get cached and performance improves.
Could you please suggest which classes need to be modified?
Thanks

shailesh (2012-07-27 10:24):
Hi, Dhruba
I am a postgraduate student in computer science and engineering, and a beginner in Hadoop and MapReduce programming. I have selected load balancing in the cloud as my major project. I went through http://wiki.apache.org/hadoop/EclipseEnvironment to set up Hadoop development in Eclipse, but I am running into problems even though I followed all the steps mentioned there. Is there some other way to do Hadoop development?

Madhavi (2011-12-01 05:49):
Thanks a lot, Dhruba. I also wanted to ask to what extent security issues are handled. In a distributed system, replication maintains the data, so data from a failed node can be recovered. Please shed some light on the security measures on such nodes. What research can be done in this area?
I would also like you to elaborate on one of the research areas you suggested:
Ability to make a map-reduce job take new input splits even after a map-reduce job has already started.

Dhruba Borthakur (2011-11-30 11:16):
@Madhavi: Hadoop is by definition "fault tolerant". But one angle of research is to create a standard benchmark to measure "fault-tolerance". Then you can run it on various versions of Hadoop to figure out whether its fault tolerance is increasing or not.

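One way to picture the benchmark idea in Dhruba's reply above is as a score computed from two runs of the same job: one on a healthy cluster and one with faults injected. The sketch below is only an illustration under stated assumptions. The `RunResult` type, the `fault_tolerance_score` function, and the scoring formula are all hypothetical and are not part of Hadoop or of any existing benchmark.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Outcome of one benchmark run (hypothetical record, not a Hadoop type)."""
    seconds: float         # wall-clock time of the job
    output_correct: bool   # did the job finish with the expected output?


def fault_tolerance_score(baseline: RunResult, faulted: RunResult) -> float:
    """Assumed metric: 0.0 if the job did not survive the injected faults,
    otherwise the baseline/faulted runtime ratio, capped at 1.0
    (1.0 means the faults caused no slowdown)."""
    if baseline.seconds <= 0 or faulted.seconds <= 0:
        raise ValueError("run times must be positive")
    if not faulted.output_correct:
        return 0.0
    return min(1.0, baseline.seconds / faulted.seconds)


# Made-up numbers, purely to show the calculation.
healthy = RunResult(seconds=100.0, output_correct=True)
with_faults = RunResult(seconds=125.0, output_correct=True)
print(fault_tolerance_score(healthy, with_faults))
```

Running the same pair of measurements against several Hadoop versions and comparing the scores would be one way to track whether fault tolerance is improving. Which faults to inject, and how often, is the real research question and is left open here.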
Madhavi (2011-11-30 06:29):
Hello Dhruba,
You said earlier that there is not much work/research on Hadoop across data centers. But data centers can be seen as nodes that are distributed in nature, so hasn't this type of research been done before?
Is it possible to study the fault-tolerant nature of Hadoop as a research topic? What else is there?
Your valuable suggestions on this would be appreciated.

Wei Yan (2011-10-25 10:17):
Hi, Dhruba
I have a question about the 2nd item in the list. Is it an incremental computing problem? Can Percolator be used to solve this?

Dhruba Borthakur (2011-10-22 00:26):
@amindri: there is not much work/research on Hadoop across data centers. I tried something called HighTide (https://issues.apache.org/jira/browse/HDFS-1432) and you can find the code at https://github.com/facebook/hadoop-20/tree/master/src/hdfs/org/apache/hadoop/hdfs/server/hightidenode
But HighTide is not yet in production in any of our clusters.

amindri (2011-10-17 18:59):
Hi Dhruba
I need to carry out a year-long research project for my undergraduate program. I'm interested in "Make map-reduce jobs work across data centers."
Has the Hadoop community started any work on this topic?
I also need to perform a literature review of existing papers before starting the research, but I couldn't find many research papers on this topic. Any advice would be very valuable.
Thanks :)

Dhruba Borthakur (2011-09-14 21:15):
@Bharath: I do not think anybody is attempting to do it. But there is a real need to do realtime analytics!

Bharath Ravi (2011-09-14 19:44):
@Dhruba, I see what you're saying now: this could essentially provide "streaming input" capability to MR. Sounds cool!
Any pointers on whether this is being attempted somewhere? From what I gathered from the community, it's not.
Thanks for the ideas!

Dhruba Borthakur (2011-09-14 18:13):
@Bharath: if you want to do realtime analytics, then new data is arriving even after the map-reduce job has been submitted. Currently, we wait for the current MR job to finish and then submit a new job. Instead, it would be nice if we could add new input splits to the currently running MR jobs. The end results will be available earlier.

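The argument in Dhruba's reply above can be illustrated with a toy timing model. This is a sketch under simplifying assumptions, not Hadoop code: every split takes the same time, the cluster has a fixed number of task slots, each job pays a fixed startup cost, and the function names (`run_job`, `separate_jobs`, `appended_splits`) are hypothetical. Batches are `(arrival_time, num_splits)` pairs, assumed non-empty and sorted by arrival time.

```python
import heapq


def run_job(start, num_splits, slots, split_seconds):
    """Finish time of one job whose splits are all available at `start`."""
    free_at = [start] * slots          # time at which each slot becomes free
    heapq.heapify(free_at)
    finish = start
    for _ in range(num_splits):
        done = heapq.heappop(free_at) + split_seconds
        finish = max(finish, done)
        heapq.heappush(free_at, done)
    return finish


def separate_jobs(batches, slots, split_seconds, startup_seconds):
    """Current behaviour: each batch is a new job, submitted only after
    the previous job has finished."""
    clock = 0
    for arrival, num_splits in batches:
        start = max(clock, arrival) + startup_seconds
        clock = run_job(start, num_splits, slots, split_seconds)
    return clock


def appended_splits(batches, slots, split_seconds, startup_seconds):
    """Proposed behaviour: one running job; new splits are queued on arrival."""
    ready = batches[0][0] + startup_seconds
    free_at = [ready] * slots
    heapq.heapify(free_at)
    finish = ready
    for arrival, num_splits in batches:
        for _ in range(num_splits):
            start = max(heapq.heappop(free_at), arrival)
            done = start + split_seconds
            finish = max(finish, done)
            heapq.heappush(free_at, done)
    return finish


# 5 splits at t=0 and 3 more at t=10, on 4 slots, 60 s per split, 20 s startup.
batches = [(0, 5), (10, 3)]
print(separate_jobs(batches, 4, 60, 20))    # 220
print(appended_splits(batches, 4, 60, 20))  # 140
```

In this model the gain comes from two places: the second startup cost disappears, and the late-arriving splits fill slots that would otherwise sit idle during the last wave of the first job.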
Bharath Ravi (2011-09-14 17:59):
Hi Dhruba,
I just came across this list as I was scanning for ideas on Hadoop to work on. I'm a newcomer to Hadoop and am generally interested in improving scalability and "flexibility"; the second point in the list especially interests me.
Is dynamic appending of input a requirement in real-world use cases? I was wondering why this is needed, when the extra input could become a second mapreduce job.

Dhruba Borthakur (2011-05-12 11:18):
@Shridhar: if you add machines to a running cluster, they will automatically be used to run tasks of currently running jobs.

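The effect described in the reply above can be seen in a small scheduling model. This is a rough sketch with assumptions of its own: tasks are identical, each worker runs one task at a time, and a pending task always goes to whichever worker is free first. The `makespan` function is hypothetical and does not reflect how Hadoop's scheduler is actually implemented.

```python
import heapq


def makespan(num_tasks, task_seconds, worker_join_times):
    """Time at which the last task finishes, given when each worker joins."""
    free_at = list(worker_join_times)   # a worker is first free when it joins
    heapq.heapify(free_at)
    finish = 0
    for _ in range(num_tasks):
        done = heapq.heappop(free_at) + task_seconds
        finish = max(finish, done)
        heapq.heappush(free_at, done)
    return finish


# 12 tasks of 60 s each.
print(makespan(12, 60, [0, 0]))          # 360: two workers for the whole job
print(makespan(12, 60, [0, 0, 60, 60]))  # 240: two more workers join at t=60
```

The workers that join at t=60 pick up pending tasks of the job that is already running, so the job finishes sooner without being resubmitted.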
Shridhar (2011-05-12 08:33):
Dhruba,
Does it make sense to add slave nodes to a Hadoop cluster that is running a job? Will the new slaves be able to participate in the current job?
Thanks.
Shridhar