It's in Didcot and it's running on open source
By Steve Ranger
Published: 24 November 2005 09:00 GMT
The mysteries of dark matter, multiple dimensions and even the conditions following the Big Bang could be solved with the help of the world's biggest computer grid. And a big chunk of it is being built in sleepy Oxfordshire.
The Large Hadron Collider (LHC) being constructed at CERN near Geneva will be the largest scientific instrument on the planet and will need hugely powerful computing to process the 15 petabytes of data that it will produce each year.
The LHC will smash protons and ions into head-on collisions to help scientists understand the structure of matter.
Discovering new types of particles can only be done by statistical analysis of the massive amounts of data the experiments will generate - which is where the LHC Computing Grid project comes in.
And although the LHC won't be up and running until 2007, work has already begun on the grid, with the UK being one of the largest contributors.
Of the 150 grid sites around the world, 18 are in the UK. And much of the UK work is being done at the Rutherford Appleton Laboratory (RAL) in Oxfordshire.
Because of the scale of processing needed, grid is the best way to go, explained John Gordon, deputy director of the Council for the Central Laboratory of the Research Councils e-Science Centre at RAL. "The computing has been planned for years; we've been looking at distributed computing for a long time," said Gordon.
He added: "They couldn't afford to do all the computing at CERN so we knew we would have a big distributed computing problem of sifting the data around the world and finding it again. It's the biggest production grid in the world."
The grid will use a four-tier model. Data will be stored on tape at CERN, the 'Tier-0' centre. From there, data will be distributed to Tier-1 sites, which have the storage and processing capacity to cope with a chunk of the data. These sites make the data available to the Tier-2s, which are able to run particular tasks. Individual scientists can then access data from Tier-3 sites, which could be local clusters or individual PCs.
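The tier model described above can be sketched as a small data structure. This is only an illustration of the hierarchy as the article describes it, not the grid's actual software: the `Tier` class, its field names and the `distribution_path` helper are all hypothetical, and the example sites are the ones named in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One level of the four-tier model (hypothetical structure for illustration)."""
    level: int
    role: str
    example: str


# Roles paraphrase the article; example sites are those it names.
TIERS = [
    Tier(0, "stores the data on tape", "CERN"),
    Tier(1, "stores and processes a chunk of the data", "RAL"),
    Tier(2, "runs particular tasks", "Lancaster, Edinburgh, Imperial College"),
    Tier(3, "gives individual scientists access", "local clusters or individual PCs"),
]


def distribution_path(tiers):
    """Return tier labels in the order data flows outward from CERN."""
    return [f"Tier-{t.level}" for t in sorted(tiers, key=lambda t: t.level)]


if __name__ == "__main__":
    for tier in TIERS:
        print(f"Tier-{tier.level}: {tier.role} (e.g. {tier.example})")
    print(" -> ".join(distribution_path(TIERS)))
```

Each level takes its data from the level above it, which is why the path is a simple ordered list rather than a general network.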
RAL hosts the UK's Tier-1 site, with the universities of Lancaster and Edinburgh and Imperial College operating Tier-2 sites.
And while real data won't start flowing until 2007, scientists are already using lots of processing power on simulations. Gordon said: "They need to know what they are looking for so they do lots of simulations."
Commodity hardware and open source software are being used to keep costs down. "Because it's worldwide we are all looking at open source," said Gordon. "All the grid stuff is done in open source, that's taken for granted. Grid should use standard protocols, it's across administrative domains."
Network bandwidth will also be key. At the moment RAL has a 2Gbps dedicated link to CERN, the same amount of bandwidth it uses for all the rest of its internet traffic, and the plan is to build a dedicated fibre-optic network between the sites.
Gordon said: "What we are looking at is setting up a network of private light-paths to Tier-1 sites."
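To put the figures in this article side by side, the short calculation below converts between a yearly data volume and a sustained link speed. It is a back-of-the-envelope sketch under stated assumptions: a petabyte is taken as 10^15 bytes, a year as 365 days, traffic is spread evenly, and protocol overhead is ignored. The function names are illustrative. It also says nothing about how much of the 15 petabytes any one site would actually receive.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def average_gbps(petabytes_per_year):
    """Sustained rate in Gbps needed to move a yearly volume (decimal petabytes)."""
    bits = petabytes_per_year * 1e15 * 8
    return bits / SECONDS_PER_YEAR / 1e9


def petabytes_per_year(gbps):
    """Yearly volume in decimal petabytes that a fully used link could carry."""
    bits = gbps * 1e9 * SECONDS_PER_YEAR
    return bits / 8 / 1e15


if __name__ == "__main__":
    print(f"15 PB/year averages {average_gbps(15):.1f} Gbps")
    print(f"A 2Gbps link carries up to {petabytes_per_year(2):.1f} PB/year")
```

On these assumptions, 15 petabytes a year averages out at roughly 3.8Gbps, while a single 2Gbps link running flat out could move a little under 8 petabytes a year.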
Managing the huge number of files the experiments will generate is another problem the team is working on, according to Gordon: "You end up with millions of files and the problem comes in handling them and that's where the data management comes in. Data management is key."
But beyond all the exciting technology, much of the work will be in persuading different organisations to share. Gordon said: "A lot of it is sociological - you are persuading people that they gain by connecting all their computers together. It's about collaboration; it's not about people sitting in London using computers all over the world, it's about groups of people working on the same problem."
Copyright © 2008 CBS Interactive Limited. All rights reserved.