
Thread: AMD Llano

  1. #1
    Security (The Special Little Boy.)
    Join Date
    Jul 2007
    Location
    The Netherlands
    Age
    22
    Posts
    9,632

    AMD Llano

    I came across a nice interview about AMD's upcoming Llano and thought I should share.


    AMD Llano - The First Accelerated Processing Unit
    Source

    AMD has started promoting its new Fusion concept, which, among other things, puts the central and graphics processors on the same piece of silicon. This approach differs significantly from the competition's current integration, so we got in touch with Samuel Naffziger, AMD Senior Fellow, who explained how the system was conceived and what the important innovations are compared to the current approach.

    IHW: So what was the exact name? Llano, if we understand correctly?
    Samuel Naffziger: Yes, it's Llano, with a double "L". It is the codename for our first APU (accelerated processing unit), which is the term we're using for our first products that integrate the CPU and GPU on a single piece of silicon. The first APU product actually has four CPU cores. Here we are talking about a new era of processor performance, and the message is at the same time complex and very simple. If you look at the history of architectures, specifically CPUs going back to the single-core days, performance advances were very much driven by frequency increases. Over time that has become less and less effective as power became more of a concern. Then we moved into the multi-core era, which is alive and kicking today and is pretty much the central AMD CPU strategy, but even then, some of the more popular and more demanding applications increasingly require additional processing performance. Even businesses use more and more graphics-heavy applications. That's a very simplistic, high-level view of how we landed on the direction we took with Llano and APUs, and how we bring in more parallel processing alongside the CPU to handle work that is not strictly graphics, so programmers can use the parallel architecture for more traditional, sequential CPU tasks.



    IHW: OK, so what about power consumption? The promotional slides show a range from 2.5 W to 25 W. That's quite a gap between minimum and maximum, and it's also quite low compared to CPUs in use right now.
    SN: For one, the 32nm technology provides power efficiency that helps drive power consumption down, and this is a mobile-focused design, so we emphasise low-voltage operation. As you may know, lowering the voltage is one of the most efficient ways to improve performance per watt. The other thing is that this is the CPU core power consumption, not that of the entire APU. And the range of usage of this core is extremely broad, so it's a key feature: it provides great response time and a good consumer experience when performance is needed, and it can deliver good battery life when it's not.
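
A rough sketch, not taken from the interview, of why low-voltage operation helps performance per watt. It uses the textbook CMOS switching-power approximation P = C * V^2 * f. The capacitance, voltages and frequencies below are made-up illustrative values, not Llano figures, and leakage is ignored.

```python
def dynamic_power(c_eff, voltage, freq_hz):
    """Textbook CMOS switching-power approximation: P = C * V^2 * f."""
    return c_eff * voltage ** 2 * freq_hz


def perf_per_watt(freq_hz, power_w):
    """Use clock frequency as a crude stand-in for performance."""
    return freq_hz / power_w


# Hypothetical operating points for a single core (assumed values).
C_EFF = 1.0e-9                  # effective switched capacitance, farads
HIGH_V, HIGH_F = 1.2, 3.0e9     # volts, hertz
LOW_V, LOW_F = 0.8, 1.5e9       # volts, hertz

high = dynamic_power(C_EFF, HIGH_V, HIGH_F)
low = dynamic_power(C_EFF, LOW_V, LOW_F)
gain = perf_per_watt(LOW_F, low) / perf_per_watt(HIGH_F, high)

print(f"high point: {high:.2f} W, low point: {low:.2f} W")
print(f"performance-per-watt gain at the low point: {gain:.2f}x")
```

Because frequency cancels out of the ratio, the gain in this simplified model depends only on the square of the voltage ratio, which is why voltage is such an effective lever.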


    IHW: For example, right now the best of the HD 5000 series has about 2.2 billion transistors per chip. That's much, much more than the 35 million transistors in a Llano core, so can you tell us exactly how many transistors the final chip will have?
    SN: We're not disclosing details of the chip level right now.

    IHW: You are aware that Intel has already introduced a CPU and GPU in the same socket, and they managed to put two different manufacturing processes in the same package. Would you care to explain the difference between your approach and theirs?
    SN: What we're doing here is very different from Intel's approach. They're trying the Larrabee path, and essentially they're validating our approach, but what they have done is just slap a low-end integrated graphics processor into a multi-chip module. That's very different from what we're doing with the Llano APU, where we are taking industry-leading graphics technology and fully integrating it with the CPU. Our solution shares the same memory subsystem and an integrated bus, which has significant advantages over just a low-end IGP attached in the CPU module.

    IHW: When you say industry-leading GPU technology is used in Llano, do you mean that you will be using the same Stream technology that exists right now in DirectX 11 discrete accelerators, or will you take some other approach?
    SN: Well, it's a DirectX 11-capable GPU, and that's one of the largest differences. We're not getting into the specifics of its capabilities versus what Intel is integrating with its CPUs, but suffice it to say it's a fully capable DirectX 11 GPU, which is different from what is integrated in Core i3 and Core i5 products.




    IHW: Let's go back to power management. What is the largest advancement in Llano?
    SN: As you can imagine, this level of integration, four cores with an advanced graphics processor, all in a mobile form factor, requires very advanced power management. One key to managing this kind of parallel processing environment is being able to power gate any unit that's not needed at a particular time. What we've done here is exploit some of the unique features of our process technology, Silicon on Insulator (SOI), to enable an efficient power gating approach on these x86 cores. So when the workload requires one, two or three cores and not all four, the unused cores burn a negligible amount of power. The goal is to provide the best performance per watt across a broad range of consumer workloads.

    IHW: When you talk about SOI technology, and at the same time about 32nm technology with immersion lithography, what exactly are the benefits of having these two technologies together at this manufacturing process size?
    SN: The aspect of SOI that works synergistically with our power gating is that there is no source/drain junction to the substrate. In bulk technology, the power supply has to be gated instead of the ground. But we can just float the ground, and since there are no diodes connected to the substrate, there is no leakage and it becomes a much more efficient system.

    IHW: What about the other innovations in Llano?
    SN: The second innovation here is that we are enabling a very efficient chip-level power management approach by using a digital power meter integrated in each one of these cores. In a nutshell, we sample nearly one hundred signals to determine core-level power consumption. It monitors core operation and execution in real time, and we maintain a running figure for power consumption at the chip level, which can be used for a variety of power management functions. We found that this digital approach not only provides better accuracy, it also provides full repeatability in the face of environmental variations, unlike other approaches used in the industry that involve temperature sensors and AV-meters.
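
To make the idea of a digital power meter a bit more concrete, here is a toy sketch of an activity-based estimator: each core turns sampled activity signals into a power estimate through a weighted sum, and the chip-level figure is the sum of the per-core running averages. The interview does not describe AMD's actual circuit or weights, so the class name, the weights, the static power term and the averaging window are all assumptions made purely for illustration.

```python
from collections import deque


class DigitalPowerMeter:
    """Toy activity-based power estimator for one core (hypothetical model)."""

    def __init__(self, weights_w, static_w, window=4):
        self.weights_w = list(weights_w)  # assumed watts per unit activity, per signal
        self.static_w = static_w          # assumed always-on power of the core, watts
        self.samples = deque(maxlen=window)  # keeps only the most recent estimates

    def sample(self, activity):
        """Record one estimate; activity[i] is signal i's activity in the range 0..1."""
        if len(activity) != len(self.weights_w):
            raise ValueError("one activity value is required per monitored signal")
        power = self.static_w + sum(w * a for w, a in zip(self.weights_w, activity))
        self.samples.append(power)
        return power

    def running_average(self):
        """Average of the estimates currently in the window (0.0 if empty)."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0


def chip_power(meters):
    """Chip-level running figure: sum of the per-core running averages."""
    return sum(m.running_average() for m in meters)


# Two monitored signals per core here; the interview mentions nearly one hundred.
cores = [DigitalPowerMeter([0.5, 0.25], static_w=1.0) for _ in range(4)]
for core in cores:
    core.sample([1.0, 1.0])   # busy interval
    core.sample([0.0, 0.0])   # idle interval
print(f"estimated chip power: {chip_power(cores):.2f} W")
```

The appeal of a purely digital estimate like this is that the same activity always gives the same number, regardless of temperature or measurement noise, which matches the repeatability point made in the answer above.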

    IHW: I don't know if you can discuss this, but you said that power management is important to you. Say there are one or more cores that are not in use at the moment. How much power will they consume in this lowest power state?
    SN: If they're power gated? Well, all we're saying right now is less than 10% of the leakage power. If you're familiar with processor design on this kind of lithography, leakage is typically 20-30% of total power consumption. In our implementation, power gating reduces the leakage component by a factor of 10.
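
The arithmetic behind that answer can be worked through in a few lines. The 20-30% leakage share and the factor-of-10 reduction come from the interview; the 10 W active core power is a made-up round number, and the sketch assumes an idle, gated core draws nothing but its (reduced) leakage.

```python
def gated_core_power(active_power_w, leakage_fraction, gating_factor=10.0):
    """Estimate the power of an idle, power-gated core.

    Assumes the idle core draws only leakage, and that power gating
    divides that leakage by gating_factor.
    """
    leakage_w = active_power_w * leakage_fraction
    return leakage_w / gating_factor


ACTIVE_CORE_W = 10.0  # hypothetical per-core power under load, watts

for fraction in (0.20, 0.30):
    gated = gated_core_power(ACTIVE_CORE_W, fraction)
    share = gated / ACTIVE_CORE_W
    print(f"leakage share {fraction:.0%}: gated core draws {gated:.2f} W "
          f"({share:.0%} of its active power)")
```

In other words, under these assumptions a gated core would sit at roughly 2-3% of its active power.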

    IHW: When you say the GPU will be integrated in the same architecture as the CPU, will there be some overlap between GPU and CPU functionality? Will they share, for example, the same memory controller or a different one? Will the clocks be different, e.g. different frequency domains?
    SN: Well, some of your questions require more detail than we're ready to give, but we're making it clear that the GPU and the CPU share the same memory subsystem, which provides many levels of improvement for both components, and they also have a high-speed on-die bus for improved communication between the CPU and GPU.





    IHW: The Llano x86 core is basically a 32nm version of the Phenom II core, if we heard correctly. So is it a straightforward shrink, or are there significant changes in the new core?
    SN: It's not a rebranded core. I wouldn't say it's a significant architectural overhaul, but there are basically some feature tweaks that address performance opportunities, so it will be a higher-performing version of the legacy core, but with the same fundamentals.

    IHW: Speaking of immersion lithography, we know that right now there are some yield problems at your partner TSMC on the 40nm GPU side. What kind of yield can we expect from the Llano platform?
    SN: Well, TSMC is not manufacturing Llano, GlobalFoundries is, but obviously our expectations are very high. I mean, that's a critical part of what we're doing here, so you are acknowledging the challenges we're taking on: the GPU was previously made in a bulk process by one company, and it now moves to the SOI process at a different fab. But we're confident in what we're working on now, and that GlobalFoundries will deliver on its promise. That's really all I can say at this point.

    IHW: You say it's a different type of manufacturing: the GPU is made on a bulk process, and the CPU side is completely different. What will you do about the GPU side? Will you still use bulk technology, or will you somehow merge it with SOI technology?
    SN: Well that's not feasible, there's only one way because it's a single die, it's a single chip. Discrete equivalent of the DX11 GPU will continue to be produced in bulk technology. We're not moving the discrete GPUs to SOI. But for Llano, it's a single integrated die so that GPU is being produced on the SOI process.

    IHW: If you can just clarify something for us: when we talk about the 32nm manufacturing process, can you give us the highlights of the biggest differences between AMD's approach and Intel's?
    SN: Beyond the obvious SOI versus bulk technology, we feel we have additional experience because this is our second generation of immersion lithography. The processes are different, but that is GlobalFoundries' area of expertise.

    IHW: Can you maybe reach some higher frequencies than your competitors?
    SN: The frequencies achieved are very much driven by the specific design, so I think it's sufficient to say we believe we have better, more efficient processors on this GlobalFoundries technology than anyone else has on theirs.


    Breaking news:
    Clouds are God's sneezes.

  2. The Following 2 Users Say Thank You to Security For This Useful Post:

    ThE NaMeLeSs (24-02-2010), Zettlerin (23-02-2010)

  3. #2
    Sean (Trust Me, I'm A Doctor)
    Join Date
    Jun 2006
    Location
    Manchester
    Age
    23
    Posts
    10,745

    Dates and prices?

  4. #3
    Security (The Special Little Boy.)

    Rick Bergman: Fusion Does Not Mean Death To The Graphics Card Market
    Source

    Last year's agreement between Intel and AMD attracted a lot of attention on the IT scene. With a history of cooperation and conflict between these companies that goes back decades, many were interested in what kind of agreement the companies had reached. We spoke to Rick Bergman, Senior Vice President and General Manager of AMD, who gave us a lot of interesting information.

    InsideHW: Mr Bergman, our readers are especially interested in the information concerning last year’s “agreement” between Intel and AMD. Could you disclose a bit more on the topic?

    Rick Bergman: You’re referring to the settlement?

    IHW: Yes. Why opt for a settlement? Did you estimate that the trial would take too long, or that the court-assigned compensation in case of a court verdict in your favour would be lower than the offered one? Perhaps it was that the non-financial part of the AMD settlement was more interesting than the money itself?
    RB: At one point the negotiations reached a stage where we felt they fulfilled our goals. There are many details related to that. Basically, we wanted to ensure fair market competition, taking into consideration patent licences as well as manufacturing ones, in order to better define the “rules of the game”. We were satisfied with the results achieved, including, of course, securing compensation for the losses caused to AMD by the previous business strategy. It was our estimate that the time was right for such a move.

    IHW: When you say patent contract, are you referring to the right to produce x86-compatible CPUs, or is that not part of this contract?
    RB: There are two parts to that contract. The first part involves the exchange of licence rights between AMD and Intel, whereas the second part is Intel granting licence rights to GlobalFoundries. Both parts are very important to AMD, since they enable us to be a company entirely without its own factories, while giving GlobalFoundries an opportunity to be an independent manufacturer of a wide range of semiconductors.

    IHW: Many rumours have circulated on the internet saying that the compilers Intel provides for its CPUs are built in such a way that they deliver much worse performance on non-Intel CPUs. Does this deal concern that topic as well?
    RB: There are no exact details on that in the settlement, since it only concerns business dealings in general. Nothing exact has been agreed upon as far as compilers are concerned. We know that it’s a part of the FTC charges against Intel, which I cannot comment upon at the moment.

    IHW: But you will basically have to provide proof on the subject to the FTC if requested?
    RB: Should the FTC send a warrant, we will certainly provide any info requested. We do business in full accordance with the law, and will therefore forward any information requested by the FTC or the EU commission.



    IHW: Intel currently has some very interesting products, both in desktop and portable segments. For now at least, their approach seems to be that of CPU and GPU unification on one socket, not one chip, which seems to be AMD’s approach towards the future Fusion platform. As far as we know, Fusion should appear on the market next year (2011). What exactly is the difference between the approaches to Westmere and Fusion projects?
    RB: The differences are obvious. Unlike Westmere, our Fusion project will use full APUs (Accelerated Processing Units), with the processing units situated on the same piece of silicon. Owing to that, we will be able to provide an acceptable ratio of performance to power consumption. Westmere is a compromise solution that is not much different from having the GPU integrated on the motherboard. Our strategy is to create a full range of single-chip products which will offer high performance with acceptable power consumption. It will be quite different from anything Intel has done so far.

    IHW: According to preliminary tests, it seems that the performance improvement Intel managed to achieve is around 50%, compared to the previous solution. On the other hand, it is clear the AMD approach requires an entirely different manufacturing process, and since the GPUs of today offer very high performance, will Fusion have anything to offer in the mid, perhaps even high class as far as GPU performance goes?
    RB: There are all sorts of synthetic tests, but we are interested in the performance gains we are able to achieve in actual applications. In fact, our goal is to achieve high performance in that segment exclusively. On the other hand, there is that moment when the financial side ceases to matter, especially with powerful workstations and enthusiasts. In such cases, the GPU effect is the thing that matters, which means that an independent graphics card will still be needed, offering higher performance. Naturally, we’re planning separate CPU and GPU products as well, since separate graphics cards will still be necessary to achieve maximum performance.

    IHW: That’s good news, at least as far as companies in the GPU business are concerned. Could you comment on the delay of the Larrabee project, since AMD predicted on multiple occasions that Intel would have problems with it and thus be unable to develop Larrabee in the intended time frame? According to Intel, instead of becoming available to end users, Larrabee is becoming a development platform. Do you expect Larrabee to be developed further, or was it a dead end?
    RB: I was working in ATI when it was bought by AMD, and even before that on the development of x86-compatible CPUs. About twenty years ago, there was the common opinion that a GPU was easy to make, which may’ve been true until 10-15 years ago. At the time, CPU design was much more complex. As years went by, GPUs became very complex and very powerful processors indeed, able to do many things. Therefore, it’s no surprise that Intel “tripped” in the attempt to combine the two. Other than the fact that neither a good CPU nor a good GPU are easy to create, they are also drastically different in design, and a fundamentally different way of thinking is needed for a successful project of that sort.

    IHW: Talking about GPU manufacturing, I remember AMD commenting, around the launch of the 4000 series of graphics cards, on Nvidia’s approach and on its expectation that Nvidia would switch to a multi-core architecture as early as the next generation, instead of a single-core architecture running at higher clocks. It’s easy to draw a parallel between that and the current situation in the CPU field: four years ago frequency was the main measure of performance, but it seems that we’ve been stuck with CPUs working at around 3 GHz for some time, only with more and more cores. The best graphics cards on the market at the moment also have multiple (two) cores with a lower clock. Nvidia Fermi isn’t on the market yet, but they’re advertising it as something that will overshadow everything else with its performance. Which approach do you suspect they will take, one massive chip or a multi-core solution?
    RB: I believe we’ll have to wait a bit longer for Nvidia to come forward with its final solution. AMD has had a clear strategy for years, which we like to call “sweet spot”. First we bring out a product with a fantastic price/performance ratio, and then we place two of those on a single board for maximum performance. If you attempt something like that with a larger and more powerful GPU, it’ll be difficult, you’ll be late, and so will your cards. That’s the reason why, at a time when Nvidia still has no DirectX 11-compatible cards available, we’re selling millions of ours that meet the standard. Working frequency is unimportant with GPUs. With CPUs, which perform calculations serially and where lower latencies are a critical benefit, clocks matter. As GPUs are more oriented towards parallel computing, multiple cores at a lower frequency do a great job. I’d also add that our products don’t run at lower clocks than the solutions offered by our “friends” at Nvidia. They have a smaller part of the chip working at a higher clock, while the rest of it works at a lower one. Our GPUs work as a whole at a unified frequency, an intermediate clock value, so to speak. If you take a look at our products and compare them, you’ll see that “performance per watt” and “performance per dollar” are on our side.

    IHW: Nvidia is currently leading in 3D display support in games and they seem to be focused on that technology. On our way here, we’ve seen announcements by AMD concerning support for 3D display of Blu-Ray films and their acceleration. Are there any plans in the making for the field of 3D display in games?
    RB: We will, of course, fully support 3D display technology in games as soon as an established standard appears, being that we’re already cooperating with multiple companies in that field. However, we don’t think that there is a large demand for such products on the market at the moment. I believe that Eyefinity and buying three monitors present a much better investment than the same amount of money required for a pair of 3D display glasses and a special 3D monitor.

    IHW: We agree completely that an investment required for 3D Vision is quite costly at the moment. On the other hand, we were particularly impressed by the demonstration of Eyefinity on eight Samsung monitors and the feeling created while watching BattleForge run on them. It is especially clear that most users still aren’t aware of just how much Eyefinity as a multi-screen technology is able to increase their productivity. Still, talking about gamers, we’re not sure how many of them have actual space needed to put three monitors on a single desk, as such an environment definitely needs some space. What do you think about that?
    RB: As far as that’s concerned, the rule is simple: seeing is believing. Of course it’s a challenge to “make” home users opt for such a thing, since they probably haven’t had the chance to see how it works. Journalists and people from the industry itself have a much easier time, as IT fairs such as CES offer an opportunity to actually try such things out. In this internet-oriented age, it is a tough task to reach ordinary users with such radical ideas, but that can be overcome as well. We organised a presentation during a massive gamer conference and the gamers' reactions were very positive. It’s also important to say that we’ve reached a stage where one can choose from quite a few different models of HD monitors and put together a nice three-monitor system for a mere $400, which is a revelatory experience in games such as DiRT 2.

    IHW: Ever since the launch of 2000 series, we’ve been hearing about tessellation, but up to now, that technology has only been used in a tech demo, while we’re still waiting for games using the tessellation method. Unfortunately, when the demo was actually run, we noticed a major drop in performance when tessellation is on, with just the explanation that the drop is dependent on the implementation, i.e. the quantity of calculations needed for the tessellation process. We had the impression that tessellation will be very important for the visual quality of upcoming games, but that the performance drop wouldn’t be as drastic. Is that possible with the current generation of DirectX 11 GPUs?
    RB: Tessellation is a postprocessing effect which should depend on the implementation, but my technical knowledge on the subject is not on the required level for me to give you a clear answer. It’s a good standard that’s been used on the Xbox 360 for years and will only advance and be used more and more in the future, now that it’s an integral part of DirectX 11.


    IHW: Intel’s plans for this year are rather interesting, especially since we’re expecting their first six-core CPU for the desktop platform in the following couple of months. What’s your response to this move?
    RB: We are very glad that we were the first to have presented a native six-core CPU. It’s obvious that we’re transferring our experience from the server segment down to the desktop segment, since more and more multi-threaded applications appear there as well, thus making good use of the extra cores. Our own six-core CPU for desktop platforms should appear on the market in the first half of the year.

    IHW: How much does it take for an average developer to implement multi-core support in their application? We’ve had multi-core CPUs for five years now, but most apps are still only optimised for single-core CPUs. Are software modifications for parallel processing really that hard to do or are application developers simply too lazy?
    RB: That question is probably better directed at the guys doing the actual programming. I really wouldn’t say that laziness is the issue. Programming for multi-core processors is actually quite a hard problem, one that has been worked on gradually for decades. Over time, tools have progressed as well and the task is easier nowadays, but that doesn’t make it smooth. Multi-core processing is much easier to implement in graphics, and recent driver revisions certainly show progress in that field as well.

    IHW: We are very positive about the improvements brought by DirectX 11, as well as Windows 7, which is indeed a great OS. The problem is that, at the moment, not even DirectX 10 is exactly fully used by developers, and 10.1 fares even worse. Do you estimate that DirectX 11 would score a bigger success than DirectX 10 and 10.1 did?
    RB: Definitely, titles boasting DirectX 11 support are appearing at a much faster rate than they did back when DirectX 10 appeared. The problem of DirectX 10 was that it was Windows Vista-bound, while DirectX 11 is supported by both Windows 7 and Windows Vista, which significantly increases the user base, which is the number one factor for developers. Another thing is that, when switching from XP and its DirectX 9 to Vista and its DirectX 10, users experienced a significant drop in performance. Now, the situation is the opposite – by switching to the new OS and the new version of DirectX, users are experiencing a performance increase. Being that there are already around 20 DirectX titles slated for this year, I expect DirectX 11 and Windows 7 to be far more successful than their predecessors.

    IHW: At one moment, it seemed that PC as a gaming platform was fading away, but the future looks brighter for PC gaming now.
    RB: Objectively speaking, there are fantastic gaming consoles on the market, but it’s the PC gaming market that gets the industry going. There are always enthusiasts for the latest hardware and newest technologies to create a new gaming experience. PC is exactly the platform to be used first for Eyefinity and DirectX 11 and that’s what keeps the industry pushing forward.

    IHW: While travelling to the fair, we also saw announcements by Lenovo regarding products based on AMD’s Vision platform. Last year, you weren’t exactly well represented on the notebook and netbook market. Is there something new planned for this year, or is it further development and advancement of the existing platform?
    RB: One may expect those two categories to slowly merge into a single one, which makes the segment of ultrathin notebooks extremely interesting. As you’re aware, last autumn we launched the second generation of our ultrathin platform, while we’re preparing the “Nile” platform for this year, due for presentation in a few months. The moment we present Fusion, we’ll have a remarkable product just perfect for this market segment.

    IHW: Since more and more functions seem to be transferred to the CPU from the motherboard, they become progressively simpler and cheaper to manufacture. How is that fact looked upon by motherboard manufacturers, who have a hard time differentiating their own products from their competitors’?
    RB: That depends on which segment we’re talking about. If we’re talking about notebook motherboards, reduction in the number of components is a good thing, since consumption, heating and space needed to place all the components to ensure functionality are all reduced. If we’re talking about the desktop segment, well, the market seems to have gone that way by itself already. As I already stated before, Fusion was not designed with enthusiasts and users who assemble their PCs by themselves in mind. For lower-end, fewer components mean a lesser price, which is a chief parameter for these products, which will become cheaper once the Fusion technology is implemented.

    IHW: With your great experience on both the GPU and CPU sides of the business, which is more difficult to manufacture, a top-performance CPU or GPU, and, in some detail, what kind of effort does it take?
    RB: You should probably ask our friends at GlobalFoundries and TSMC. They can probably provide you with more specifics than I could on what’s involved. Both products truly do have their own unique complexities and I’ll leave it to the manufacturing experts to address that question. What I can say is that GPU product cycles tend to run faster than on the CPU side. AMD is working to incorporate the GPU pace of innovation and execution across all of our products. We call this AMD Velocity.

    IHW: Thank you for your time.



  5. The Following User Says Thank You to Security For This Useful Post:

    ThE NaMeLeSs (24-02-2010)

  6. #4
    Security (The Special Little Boy.)

    Originally posted by Sean:
        Dates and prices?

    Fusion will be out sometime in 2011. Final specifications aren't known yet (outside of AMD), and we shouldn't expect price estimates for a while either.



  7. #5
    Security (The Special Little Boy.)
    Join Date
    Jul 2007
    Location
    The Netherlands
    Age
    22
    Posts
    9,632

    Awards Showcase

    Downloads
    7
    Uploads
    1
    Blog Entries
    3
    vCash
    1500
    Rep Power
    267

    Default

    Here is a nice analysis of Llano and what AMD have actually changed:


    AMD finally outs the 32nm Llano core
    Source


    AMD IS FINALLY starting to talk about its Fusion CPUs, specifically the first one called Llano. The bad news is that it is not saying very much, but there are some interesting bits that leaked out at ISSCC 2010 in San Francisco.

    All the questions that people wanted answered, how is the GPU integrated, how many shaders, performance estimates and similar metrics sadly were not disclosed. ISSCC is about the circuits themselves, how they are made, and why the techniques are important.

    At a talk entitled "An x86-64 Core Implemented in 32nm SOI CMOS", AMD's R Jotwani answered a lot of questions that started with how, but very few that began with what. Luckily, those are some of the more important bits that rarely get covered in a mainstream CPU launch.

    Some of the 'what' questions were answered, like the fact that Llano uses a mildly tweaked version of the current K10h core found in AMD's Shanghai and Istanbul CPUs. The initial variant has four cores and adds a GPU to the mix. The GPU is based on the current 'Evergreen' DX11 cores, but how many shaders there are is an open question right now.

    One question we asked AMD representatives at ISSCC was about voltage domains. The GPU on the die has its own voltage and clock domains, so it is very unlikely to be running at the full 3+GHz clocks of the CPU core. Since the shaders are the same as those found in the current market-leading GPUs but are built on a brand new process and are integrated into the core, the current 800-900MHz range for ATI GPUs does not mean very much.

    That brings up another question, the socket. Looking at the slides leaked by AMD, the chipsets for the higher end Zambezi CPU carry over through 2011. Llano on the other hand goes from an ATI RS880 northbridge and a SB8xx southbridge to just a "Hudson-D" series southbridge. Note the lack of any northbridge. This means that Zambezi likely uses the rumored AM3R2 socket, but Llano will almost assuredly use a new socket.

    The core itself is changed a bit, but if you are familiar with the current 45nm K10h parts, you will feel right at home. AMD upped the L2 cache to 1MB per core, up from the current 512K, but it maintains the current 16-way associativity. The instruction window is enlarged to 84 entries so things should be a bit more efficient, and the instruction scheduler is now 30 entries for Integer, 36 for FP.
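
    To make the L2 numbers concrete, here is a small Python sketch of how cache size and associativity translate into a number of sets. The 64-byte cache line is an assumption on my part (the article does not state the line size), and the function name is purely illustrative.

```python
# Sketch: how a cache's size and associativity determine its number of sets.
# ASSUMPTION: 64-byte cache lines; the article does not give the line size.

def cache_sets(size_bytes: int, ways: int, line_bytes: int = 64) -> int:
    """Return the number of sets in a set-associative cache."""
    return size_bytes // (ways * line_bytes)

old_l2_sets = cache_sets(512 * 1024, ways=16)   # current 45nm parts (512K L2)
new_l2_sets = cache_sets(1024 * 1024, ways=16)  # Llano (1MB L2, per the article)

print(f"512K, 16-way: {old_l2_sets} sets")
print(f"1MB,  16-way: {new_l2_sets} sets")
```

    Under that assumption, keeping 16-way associativity while doubling the capacity simply doubles the number of sets.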

    Hardware integer divide is said to be improved and latency for FP instructions has been reduced as well. To fill these windows, there is a better prefetcher, cache lines transition between states faster, and memory fill speed is increased. The TLB is also improved for better residency. Although these little details may not seem like all that much, a percent or three here and there adds up to quite noticeable improvements when everything is added up.


    A Llano core (Picture courtesy of the ISSCC)

    The pictures released by AMD show only a single Llano core, not the entire chip, nor do they have any of the uncore, unless you count the L2 as uncore. It is almost as if it doesn't want the interesting parts out yet. That said, each core is more than 35 million transistors and occupies 9.69mm^2, and 110 million transistors and 17.7mm^2 if you count the L2 and power gating ring.
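
    As a quick sanity check on those figures, the transistor density can be worked out directly. This is only a back-of-the-envelope sketch in Python: it treats "more than 35 million" as exactly 35 million, and the variable names are mine, not AMD's.

```python
# Back-of-the-envelope transistor density from the figures quoted in the article.
# ASSUMPTION: "more than 35 million" is treated as exactly 35 million.

core_transistors = 35e6      # core only
core_area_mm2 = 9.69
total_transistors = 110e6    # core + L2 + power gating ring
total_area_mm2 = 17.7

core_density = core_transistors / core_area_mm2 / 1e6      # millions per mm^2
total_density = total_transistors / total_area_mm2 / 1e6

# Everything that is not the core itself (L2 cache plus the power gating ring)
rest_density = (total_transistors - core_transistors) / (total_area_mm2 - core_area_mm2) / 1e6

print(f"core only:      {core_density:.2f} M transistors/mm^2")
print(f"core + L2 + PG: {total_density:.2f} M transistors/mm^2")
print(f"L2 + PG ring:   {rest_density:.2f} M transistors/mm^2")
```

    The L2 region comes out much denser than the core logic, which is what you would expect from regular SRAM arrays.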

    Llano is built on AMD's new 32nm High-K Metal Gate (HKMG) Silicon On Insulator (SOI) process, and uses 11 metal layers, the same as Shanghai and Istanbul. The only change is that Metal 3 was reduced in pitch, and a lower K dielectric was used. On the silicon side, AMD is using dual strain liners, eSiGe, and some long-channel transistors to increase performance. The process also uses its second generation of immersion tools to draw the pretty lines on the wafers, think underwater basket weaving on a sub-micron scale with multi-million dollar tools.

    Power use is probably the overriding factor in modern chips, and AMD made a lot of changes to Llano to reduce power draw. It officially cites three main architectural changes - core power gating, digital APM (power management), and a clock grid redesigned for reduced power use. On a more granular level, SOI brought some major changes to the circuits themselves.

    One of the most changed circuits is something called a Delayed-onset Keeper. This circuit was necessitated by changes in the electrical characteristics of the 32nm HKMG SOI transistors themselves, since the 'old way' would not work all that well on the new process. The Delayed-onset Keeper improves slack and lessens leakage, but how it works is beyond the scope of this article.

    Another big circuit change is in the L1 cache cell, which moves from a double-pumped 6T design to an 8T design. Mirroring some of the changes that Intel made from the 90nm to 65nm P4's, AMD is trading off a smaller and more complex design for a larger but much more robust one.

    The current cache dates back to the K8, and is double pumped to allow two loads or stores per cycle. While this method works, it is complex and after six or seven years, has become a bit limiting. The solution trades complexity for 33 percent more transistors and the area they consume. Since the L1 is only 128K (64K Data, 64K Instruction), it is noticeable but hardly blows out the die budget. You can see the size on the die shot above.
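
    The 33 percent figure follows directly from going from six to eight transistors per bit cell. A minimal Python sketch, counting only the data bit cells (tags, parity and peripheral logic are ignored, which is my simplification):

```python
# Sketch: transistor cost of moving the L1 bit cells from 6T to 8T.
# ASSUMPTION: only the data bit cells are counted; tags, ECC/parity and
# peripheral circuits are ignored.

BITS_PER_BYTE = 8

def cell_array_transistors(capacity_bytes: int, transistors_per_cell: int) -> int:
    """Transistors needed for the bit cells of an SRAM of the given capacity."""
    return capacity_bytes * BITS_PER_BYTE * transistors_per_cell

l1_bytes = (64 + 64) * 1024   # 64K data + 64K instruction, per the article

six_t = cell_array_transistors(l1_bytes, 6)
eight_t = cell_array_transistors(l1_bytes, 8)
increase = eight_t / six_t - 1

print(f"6T array: {six_t:,} transistors")
print(f"8T array: {eight_t:,} transistors")
print(f"increase: {increase:.1%}")
```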

    Latency does not change, it is still 3 cycles, but the changes should allow the chip to scale to much higher clocks. Since this L1 architecture will likely live on for years to come, the change is almost mandatory to avoid a nasty ceiling on future clock scaling. This was necessary for future cores, Bulldozer in particular.

    On the architectural level, the biggest change is called Core Power Gating, something Intel introduced in Nehalem. The idea is simple, even when off, as long as power can get to a transistor, some of it will get around the gate and be lost. This is conventionally known as leakage, and has become one of the most troubling problems in modern chip building.

    As process geometry shrinks, the silicon gates get smaller, and more electrons get through them. High-K Metal Gates improve this, but don't stop leakage entirely. Until you stop electricity from getting to the transistors, they will always leak a little. A few hundred million 'littles' add up to light bulb territory for lost power and heat generated, something first postulated by Fermi if we recall correctly.

    Intel and AMD have come up with a solution. They put a ring of transistors around the core itself, it is the black border labeled PG ring in the picture above. What it does is when a CPU goes into the new C6 sleep state, all internal data is saved to off-core DRAM, and the core is powered down completely.

    It does not run slowly, the power gates turn off power to it entirely, and then those 110 million transistors stop leaking. This can be a huge power savings, AMD claims a 10-fold reduction in core leakage.

    How it was done is pretty interesting as well. SOI is known to be better at preventing some kinds of leakage, and in this case, it is a huge advantage. AMD can use NFETs for the ring instead of the larger PFETs. Since the ring is 1.38 Million transistors per core, smaller is a good thing.
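
    To put the power gate ring in perspective, here is a rough Python sketch of its transistor overhead and of what a 10-fold leakage reduction means. The 2.0 W idle leakage per core is a made-up illustrative number, not something AMD has disclosed.

```python
# Sketch: overhead of the power gating ring and the effect of a 10x leakage cut.
# ASSUMPTION: the 2.0 W leakage figure below is hypothetical, for illustration only.

ring_transistors = 1.38e6    # per core, from the article
core_transistors = 110e6     # core + L2 + ring, from the article

ring_overhead = ring_transistors / core_transistors

def gated_leakage(leakage_w: float, reduction_factor: float = 10.0) -> float:
    """Leakage left once the core is power gated (AMD claims a 10-fold reduction)."""
    return leakage_w / reduction_factor

hypothetical_leakage_w = 2.0
remaining_w = gated_leakage(hypothetical_leakage_w)

print(f"ring overhead: {ring_overhead:.2%} of the transistor budget")
print(f"leakage: {hypothetical_leakage_w} W -> {remaining_w} W when gated")
```

    So the ring costs a bit over one percent of the transistors per core in exchange for cutting core leakage to a tenth while the core sleeps.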


    Power Gate edge (Picture courtesy of the ISSCC)

    One interesting bit is the ring is not evenly shaped. You would expect that the ring would be between the bumps supplying power to the core and uncore. Drawing a box around the core would not give it enough area for the power bumps needed to feed power to the core. AMD was essentially pad/bump limited if it didn't want to overload the bumps.

    To fix this, the 'ring' was changed from a rectangle to a rectangle with a sawtooth edge. That sawtooth allowed the designers to fit 50 percent more bumps under the ring, giving them the desired safety margin on the bumps. If the numbers were run correctly, AMD will never have an Nvidia-esque "end user usage pattern" moment.

    The next big one is digital power management (APM), basically being able to read how much power the CPU is using on a time scale that is less than "thermal time frames". That is the politically correct way of saying that it will catch power spikes before the magic smoke that makes circuits work is set free.

    Digital APMs were chosen because digital monitoring can provide accuracy within 2% while sampling about 100 signals. The other methods, amperage and temperature monitoring, are far less precise. Llano digitally samples 95 separate signals and achieves better than 98% accuracy.
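
    The article does not say how the sampled signals are turned into a power number. One plausible way, sketched below in Python, is a weighted sum of activity counts over a sampling window. The formula, the function name and every numeric value except the count of 95 signals are assumptions for illustration, not AMD's actual method.

```python
# Sketch of a digital power estimate built from sampled activity signals.
# ASSUMPTION: power is modelled as a weighted sum of per-signal event counts;
# the article gives the number of signals (95) but not the estimation formula.

NUM_SIGNALS = 95  # from the article

def estimate_power_w(event_counts, energy_per_event_j, window_s, static_w=0.0):
    """Estimate average power over one sampling window.

    event_counts       -- how often each monitored signal toggled in the window
    energy_per_event_j -- calibrated energy cost per event for each signal (joules)
    window_s           -- length of the sampling window in seconds
    static_w           -- constant (leakage) power added on top
    """
    if len(event_counts) != len(energy_per_event_j):
        raise ValueError("need one weight per monitored signal")
    dynamic_energy_j = sum(c * e for c, e in zip(event_counts, energy_per_event_j))
    return static_w + dynamic_energy_j / window_s

# Hypothetical sample: every signal fired 1000 times in a 1 microsecond window,
# each event costing 0.1 nJ, plus 1 W of static power.
counts = [1000] * NUM_SIGNALS
weights = [1e-10] * NUM_SIGNALS
power = estimate_power_w(counts, weights, window_s=1e-6, static_w=1.0)

print(f"estimated power: {power:.2f} W")
```

    The point of doing it digitally is that such an estimate is available every sampling window, far faster than a temperature sensor could react.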

    Temperature measurements are environmentally sensitive while not being repeatable and reliable. Ammeters are better but are still temperature sensitive, and vary on a part by part basis requiring individual calibration. Digital APMs were the only sane course and allow for 'turbo' functionality should AMD choose to implement that on a future core.

    Finally, we have a clock distribution network that is designed to minimize power use. The clock grid is literally a grid of wires that bring the timing clock pulses to circuits that require them. There are thousands of these across a core, so the grid needs to touch almost all of the core.


    Llano clock grid before and after (Picture courtesy of the ISSCC)

    Driving this many high speed signals precisely burns a lot of power, so AMD rearranged the Llano core to cluster things that needed clock signals. The end result is a depopulated grid that barely resembles a grid. The transistors lost massively decrease the clock power used, with a claimed 84 percent drop in clock spine switching power, vastly fewer end clock buffers, and a total of 54 percent less power used for the clocks.

    On top of the reduction in clock buffer count, AMD also gated them in a much finer way than ever before. At MaxPower, Llano has 32 percent of the clocks firing but only 12 percent when halted. This massively drops power when the CPU is sleeping. For the whole core, Llano only uses 68 percent of the static power and 84 percent of the dynamic power of its 45nm predecessor at a normalized clock.
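
    What those static and dynamic percentages mean for total core power depends on how the power was split on the 45nm part, which the article does not say. A small Python sketch, with the 30/70 static/dynamic split being a purely hypothetical example:

```python
# Sketch: total core power relative to the 45nm predecessor at the same clock.
# From the article: Llano uses 68% of the static and 84% of the dynamic power.
# ASSUMPTION: the static share of the old core's power is unknown; 0.3 is hypothetical.

STATIC_RATIO = 0.68
DYNAMIC_RATIO = 0.84

def relative_core_power(old_static_share: float) -> float:
    """Llano core power as a fraction of the 45nm core's power.

    old_static_share -- fraction of the 45nm core's power that was static (0..1)
    """
    if not 0.0 <= old_static_share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return old_static_share * STATIC_RATIO + (1.0 - old_static_share) * DYNAMIC_RATIO

example = relative_core_power(0.3)
print(f"with a 30% static share: {example:.1%} of the old core's power")
```

    Whatever the real split is, the result lands somewhere between 68 and 84 percent of the old core's power.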

    Overall, it looks like Llano has brought AMD up to par with Intel's Nehalem core for power management, but as you might know, the Westmere cores are out now. According to a talk from Intel shortly before AMD's presentation, Westmere added uncore power gating to the mix, raising the bar a bit.

    It doesn't matter that much in the end, Llano is the last generation of AMD's 'stars' cores, and it will be sent off with a bang. Power is dramatically dropped, clocks are raised, GPUs integrated, and the L1 cache has been updated for the first time since 2003.

    What you see in Llano is not just an x86 with an integrated GPU, but the preview of what to expect in the upcoming Bobcat and Bulldozer CPUs. It is the last of the old line and the first of the new line at the same time, and almost every part of the chip has been updated in some way. I can't wait for AMD to drop the curtain and tell us the rest of the secrets.



  8. The Following User Says Thank You to Security For This Useful Post:

    ThE NaMeLeSs (24-02-2010)

  9. #6
    Security

    What we know about AMD’s next-generation processors

    You might be surprised to learn that AMD is just seven months away from releasing new CPUs based on not one, but three, new designs. The Phenom II that we have known for the past 17 months will soon be put to pasture, never to be seen again. Its replacements are built for the server, the desktop, the notebook and the netbook.

    Dubbed Bulldozer, Bobcat and Llano, the new processor designs are the final piece of AMD’s grand strategy to emerge from years of debt and struggle as a leaner, meaner company. For enthusiasts, they are something altogether more important: a clear sign that the fascinating war between AMD and Intel is about to go nuclear once again.

    Bulldozer: the chip for enthusiasts


    Chips based on Bulldozer will be scalable across any number of what AMD calls “modules” (shown above), each of which contains two CPU cores. It is postulated that each module is equipped with a technology called Cluster-Based Multi-Threading, or CMT.

    To understand CMT, we must first have an understanding of its lesser sibling, Simultaneous Multi-Threading (SMT), which you are likely to know by Intel’s name: Hyper-Threading. Though Intel did not create the technology, their implementation is by far the most famous.

    Intel’s implementation of SMT duplicates architectural states (the part of a CPU which holds the condition of a process) but not the execution engine. This allows their processors to maximize execution resources by busying silicon that would otherwise lie idle, or by injecting threads into the pipeline in the event of a stall.

    To give a real-world analogy, Intel’s implementation of SMT is similar to an automobile assembly plant with only one assembly line capable of taking a car from parts to completion. At every stage of the assembly, however, workers are standing by with completed parts to keep the line moving if there’s a problem. The workers can’t build a car (they don’t have a line), but they can make sure that line is always moving the car on to the next step without issue.

    Intel uses SMT in the same way: to ensure that the processor’s line is always busy moving to the next step, and today’s operating systems are increasingly intelligent at dispatching threads for this setup.

    The “problem” with this implementation of SMT is that one instruction window tracks the dispatch, execution and retirement of both threads. Going back to the assembly line, it would be like putting one supervisor in charge of watching the line and the workers—that supervisor can’t watch for problems with the line and the workers at the same time. Something is bound to fail. On a CPU, as in an assembly line, failures lead to a reduction in apparent performance.

    Each Bulldozer module, meanwhile, puts the plant on steroids not only by adding a second fully-functional assembly line, but by giving each line the ability to break one big stage down into several, parallel stages—little assembly lines that can be created, run, merged and closed on demand without sacrificing the efficiency of the main assembly line. This is CMT, and the Bulldozer can do it.



    When a processor is done sending calculations through the pipeline, it stores that data in cache for programs to access (L1 DCache in the diagram below). In essence, these are the completed cars sitting in the parking lot waiting for transport. Intel processors have one parking lot that may contain a mix of cars and trucks, which reduces efficiency when a shipping company arrives to grab a shipment made exclusively of trucks. The Bulldozer plant has two parking lots, which gives that plant more flexibility to be efficient with storing and shipping.

    From end to end, the entire Bulldozer plant can do more, and do it more intelligently than the plants AMD and Intel run today.



    Going back to raw architecture, both of Bulldozer’s lines share a single floating point scheduler (cordoned in red), with two 128-bit FMAC pipelines. Fused multiply-accumulate (FMAC) gives the chip improved floating point precision, which grants Bulldozer a leg up on the Phenom II when it comes to calculating big equations more accurately and efficiently. And, when you realize that everything you do on a computer is a mathematical equation, you can see why this is important.
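
    To see where the precision gain of a fused multiply-accumulate comes from, here is a minimal Python sketch. A separate multiply and add rounds twice, while a fused operation rounds only once at the end. The fused version is emulated here with exact rational arithmetic; the function names and test values are my own illustration and say nothing about how AMD's FMAC hardware is built.

```python
# Sketch: why a fused multiply-add (a*b + c with a single rounding) is more precise.
# ASSUMPTION: the fused operation is emulated in software with exact fractions;
# this only illustrates the rounding behaviour, not the hardware.
from fractions import Fraction

def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Multiply, round to double, then add and round again (two roundings)."""
    return a * b + c

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly, then round to double once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

# The exact product is 1 - 2^-60, which a double cannot hold, so the separate
# multiply rounds it to 1.0 and the small term is lost before the add.
print("unfused:", unfused_multiply_add(a, b, c))
print("fused:  ", fused_multiply_add(a, b, c))
```

    With these inputs the unfused version loses the result entirely, while the fused version keeps the tiny remainder.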

    A 128-bit floating point pipe is also a natural choice as AMD has announced SSE5 for the Bulldozer, an instruction extension that has several 128-bit multimedia instructions. Fusing the 128-bit FPUs will also allow the chip to crunch 256-bit Intel AVX instructions in just one cycle. SSE5 and AVX alone will take these processors to a whole new level of performance when it comes to multimedia, encryption and scientific research.

    Finally, the Bulldozer brings forward the Phenom II’s cache hierarchy by dumping all the pipelines into shared pools of L2 and L3 cache. These shared L2 and L3 caches give either core on a Bulldozer module access to completed calculations that can be pulled back in to speed up a new task. This is standard for today’s processors.

    Your future Bulldozer CPU
    The first enthusiast CPU to employ the Bulldozer design is currently codenamed Zambezi, and it will contain four of these dual core modules for a total of eight cores. We also know for a fact that Zambezi will use socket AM3, meaning anyone with a DDR3 Phenom II motherboard will be ready to rock with a BIOS upgrade.


    What about performance?
    Unfortunately, there are some elements of the Bulldozer design that we just don’t understand yet, including:
    • How many cars the supervisor can send down the line at a time;
    • How many stages it takes to complete a car;
    • How AMD has configured the floating point unit (FPU) to run the numbers;
    • And how exactly AMD shares the single FPU amongst two independent assembly lines.
    Until this information tips up, we just can’t know how Bulldozer will compare to today’s processors. In the interim, we can only admire the genuinely different architecture and speculate over the diagram’s many ambiguities.

    Bobcat: the chip for netbooks
    Next on the launch deck is AMD’s “Bobcat” architecture, a chip explicitly designed to cater to products containing CPUs like the Athlon Neo or the Intel Atom.

    According to the company’s roadmaps, the first chip to launch with Bobcat architecture will be the 32nm Ontario APU, which combines two Bobcat modules and a rudimentary DirectX 11 chip on the same processor.



    Each Bobcat module is a single core design, with one supervisor (int scheduler) and one assembly line, which consists of the I-Pipes, Ld-Pipe and St-pipe in the diagram above. These can be considered specialized workers—electricians versus mechanics, for example—that perform unique tasks on the car while it is rolling down the line. You’ll note that Bulldozer, too, had four pipelines per int scheduler, but we just don’t know what kind of workers they are yet.

    The Bobcat’s integer pipe is paired with a dual-pipe FPU, ambiguously titled “A-Pipe” and “M-Pipe” in this diagram. We postulate that the “A” and “M” refer to the addition and multiplication/division floating point operations, respectively. The size of these pipelines—the number of bits they can calculate at a time—will not only determine what this processor is strongest at, but its complexity, and how it consumes power.

    On the topic of power, AMD claims that Bobcat is capable of radiating less than 1 watt of heat, which could mean something around 0.5W. A chip at that wattage isn’t doing much more than sitting around on standby, but it’s a healthy number for users looking for laptop designs with a long standby life. In practice, Bobcat’s actual TDP should be around 5-10W, which is perfect for netbook-sized laptops.

    On the point of performance, AMD says it’ll weigh in at “90% of today’s mainstream performance” at less than half of the die size. If AMD’s definition of mainstream is the Athlon II—an assumption that bears out in their platform roadmaps—then Bobcat is essentially an Athlon II in a (much) smaller, cooler and quieter package. Not bad.

    Bobcat’s most remarkable feature is not its architecture, however, but its design process. AMD has designed the Bobcat via high-level synthesis, or HLS. HLS is a process by which a chip’s design begins its life as a set of behaviors coded by a programmer in C++. The code is then interpreted and synthesized by a machine that manufactures a processor that exhibits the behavior written by the programmer.

    HLS is a fascinating way to rapidly design and produce a chip that can easily be modified or ported to other processes for outstanding flexibility in the market. The trade off for this agility is frequency—Bobcat’s maximum clockspeed with an HLS-driven design is about 20% lower than it could have been were it designed “by hand.”

    All things considered, Bobcat will assuredly be faster than any ultra low-voltage chip in the market today; it will handily eclipse the Nano, the Atom and the Athlon Neo, by orders of magnitude on some metrics. Additionally, AMD’s decision to roll with HLS gives the firm the ability to respond to market conditions in ways its competitors simply cannot with current processes.

    Fusion: the chip for notebooks and budget desktops
    AMD’s acquisition of ATI Technologies was completed on October 26, 2006 and was accompanied by an official, and very important statement:
    AMD plans to create a new class of x86 processor that integrates the central processing unit (CPU) and graphics processing unit (GPU) at the silicon level with a broad set of design initiatives collectively codenamed “Fusion.”
    In other words, AMD announced that it would soon put GPUs and CPUs on a processor. AMD calls such a chip an accelerated processing unit, or APU. If you’re familiar with the CPU market, the APU might not be new to you: some of Intel’s Core i5 processors have a GPU onboard. Yes, Intel beat AMD to the punch, and it was almost a direct result of AMD’s financial hardship.

    Despite yielding the first design wins to its chief rival, there is a silver lining for AMD’s APU initiative: even AMD’s slowest modern GPU bloody annihilates anything Intel has to offer. This includes the GPUs AMD plans to stick inside its processors, starting next year with Llano.

    Llano
    The Llano CPU is AMD’s first processor scheduled to adopt the Fusion APU design. Based on the die shots provided earlier this year, the chip strongly resembles an Athlon II X4 that has been shrunk from 45nm to 32nm to accommodate an onboard GPU.

    This would make perfect sense given that Llano and Propus are both oriented for the mainstream. Marrying existing technologies manufactured at a smaller size is much easier than starting over with a brand new architecture when none is needed.


    Propus (left) Llano (right)

    It is certainly worth noting that the above x-ray of the Llano is not complete; the bottom section of the chip has been cut off in press materials, meaning there’s even more silicon at play than we can see at this time.

    However, judging from what we can see, the Llano APU will feature 512k-1MB L2 cache per core, no L3 cache and six Radeon HD 5000-series units for a total of 480 stream processors.

    In short, Llano is shaping up to be an Athlon II X4 with 66% of a Radeon HD 5750 on board. If that bears out, then it is more than capable of slugging Intel’s Clarkdale and Arrandale (Core i5) designs into the pavement without lifting much more than a few fingers.
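
    The arithmetic behind the 480 stream processors and the "66%" figure can be checked quickly. A short Python sketch, assuming the six units are Evergreen SIMD engines with 80 stream processors each and using the Radeon HD 5750's 720 stream processors as the reference:

```python
# Sketch: where "480 stream processors" and "66% of a Radeon HD 5750" come from.
# ASSUMPTION: the six units seen on the die are Evergreen SIMD engines
# (80 stream processors each); the article only gives the totals.

STREAM_PROCESSORS_PER_SIMD = 80   # Evergreen (Radeon HD 5000 series)
HD5750_STREAM_PROCESSORS = 720    # Radeon HD 5750

llano_simd_units = 6              # estimated from the die shot in the article
llano_stream_processors = llano_simd_units * STREAM_PROCESSORS_PER_SIMD
share_of_hd5750 = llano_stream_processors / HD5750_STREAM_PROCESSORS

print(f"Llano GPU: {llano_stream_processors} stream processors")
print(f"That is {share_of_hd5750:.1%} of a Radeon HD 5750")
```

    Keep in mind this only compares shader counts; clocks and memory bandwidth will decide how close it gets in practice.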

    Recap
    Before we head into our final thoughts, let’s take a moment to quickly summarize all the architectures that have been tossed around in this article.

    Zambezi
    Family: Bulldozer
    Cores: 4 to 8
    Process: 32nm
    Socket: AM3
    Onboard GPU?: No
    Platform: Scorpius
    Role: Performance Desktop
    Launch date: Late 2010

    Ontario
    Family: Bobcat
    Cores: 2-4
    Process: 32nm
    Socket: N/A
    Onboard GPU?: Yes
    Platform: Brazos
    Role: Ultra Thins, Netbooks
    Launch date: 2011

    Llano
    Family: Stars (Athlon II)
    Cores: 4
    Process: 32nm
    Socket: N/A (AM3 rumored)
    Onboard GPU?: Yes
    Platform: Brazos
    Role: Mainstream notebook, mainstream desktop
    Launch date: 2011

    Final thoughts
    AMD has been saying that “the future is Fusion” for years, and the company is just now in a place with its capital and processes to realize that future. By 2011, AMD will completely revamp their desktop, laptop and netbook offerings with three innovative and purpose-built CPU designs, all of which can be paired with on-die GPUs if the market demands it.

    You read that right: Llano isn’t the only design that can support an onboard GPU. AMD can pair Bulldozer and Bobcat modules with a GPU, too.

    Now, AMD’s first generation Fusion won’t have the performance to take on the discrete GPU market, but the groundwork is being laid. It will start with mainstream and low-voltage in laptops and netbooks, respectively. Economical desktop designs aren’t out of the question either, but there are signs that something much bigger is in the works.

    For example, Bulldozer may not be an APU now, but its relatively small floating point unit speaks to a future architecture that cedes floating point operations entirely to the GPU, a component that crushes the CPU in floating point performance.

    And indeed, in conversations with AMD, this is the paradigm they have been working to kickstart: a computing ecosystem that recognizes CPUs and GPUs alike as valid processors for a program. They envision a day when processing tasks are easily and automatically sent to the best processor for the job.

    We are just beginning on that road, the one that blurs the line between the CPU and the video card, but AMD appears poised to make a confident first step. They have the resources, they have the engineers, and they have the drive. AMD is extremely passionate about where they’re going with their market strategy; talking to engineers and representatives at all levels of the company reveals an infectious enthusiasm that can’t be manufactured or faked.

    Do not believe for a moment that competition between AMD and Intel has waned: 2011 will be more exciting than ever.


    Correction (5/19/2010): Astute readers have noted that we erroneously attributed socket C32 to the Bobcat, whose true socket remains unknown at this time. The story has been updated to reflect more current information.


    Source.
    Last edited by Security; 24-05-2010 at 01:41 AM.



  10. The Following User Says Thank You to Security For This Useful Post:

    ThE NaMeLeSs (25-05-2010)

  11. #7
    Underboss Panther_Seraphi
    Join Date
    Dec 2008
    Location
    Birmingham, Midlands
    Age
    21
    Posts
    719

    Awards Showcase

    Downloads
    2
    Uploads
    0
    vCash
    500
    Rep Power
    3

    Default

    I'm worried about the memory bandwidth of the Llano implementation. As everyone knows, games are bandwidth whores, and if the GPU is using the same bus as the CPU for its calculations, it's going to need some serious bandwidth. Most graphics cards have tens if not hundreds of gigabytes per second for memory transfers, so what's going to happen if that gets cut down to barely 15-20?

  12. #8
    Security

    Well, this new generation will have full quad-channel memory support, which should roughly double the bandwidth of a dual-channel setup without the need for higher memory and bus clocks.
    Then there will be support for 1866MHz memory added to the AM3 and AM3R2 sockets, allowing for another boost in memory bandwidth of roughly 15%.

    Tbh. I have no idea if that boost will be enough but it is a nice start and since many aspects of the chips are still unknown all we can do is hope.
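
    For anyone who wants to play with the numbers, here is a rough Python sketch of theoretical peak memory bandwidth. It assumes standard 64-bit (8-byte) DDR3 channels and ignores real-world efficiency; which channel counts and speeds the final chips support is still speculation.

```python
# Sketch: theoretical peak memory bandwidth for different DDR3 setups.
# ASSUMPTION: 64-bit (8-byte) wide channels and 100% efficiency; real-world
# numbers are lower, and the supported configurations are not confirmed.

def peak_bandwidth_gb_s(transfers_mt_s: float, channels: int, bus_bytes: int = 8) -> float:
    """Peak bandwidth in GB/s (1 GB = 10^9 bytes)."""
    return transfers_mt_s * 1e6 * bus_bytes * channels / 1e9

dual_1333 = peak_bandwidth_gb_s(1333, channels=2)
quad_1333 = peak_bandwidth_gb_s(1333, channels=4)
dual_1600 = peak_bandwidth_gb_s(1600, channels=2)
dual_1866 = peak_bandwidth_gb_s(1866, channels=2)

print(f"dual channel DDR3-1333: {dual_1333:.1f} GB/s")
print(f"quad channel DDR3-1333: {quad_1333:.1f} GB/s ({quad_1333 / dual_1333:.1f}x)")
print(f"dual channel DDR3-1866: {dual_1866:.1f} GB/s "
      f"(+{dual_1866 / dual_1600 - 1:.0%} over DDR3-1600)")
```

    Going from two to four channels doubles the peak, and 1866 over 1600 adds roughly another sixth on top, which is still a long way from what a discrete card gets from GDDR5.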



  13. #9
    mwahahaha ThE NaMeLeSs
    Join Date
    Sep 2005
    Location
    Hell
    Age
    21
    Posts
    8,359

    Awards Showcase

    Downloads
    1
    Uploads
    0
    vCash
    500
    Rep Power
    216

    Default

    Damn, I was already thinking about getting an AM3 board (plus an X6 and DDR3 RAM) later this year. Now that they say Bulldozer comes out late this year, I may have to skip the X6 1kT CPUs :P

    I'm really looking forward to seeing how this turns out. I already like the quad channel support (if it's really true), and even more that it will have some sort of "Hyper-Threading" like Intel. It's exactly what everyone was complaining about, and now AMD delivers.
    i40 in 3 sentences by Garner: 'i once was in a gaypub' 'They were all pissing over me' '*ultimate tent-vomit-bomb*'
    i40 friday night, getting asked by jenny for quarter hour: 'why are you coming all this way here and then be such a fucking dick?'
    i40 saturday morning, Pen0r: 'Goddamn, my butt burns so much'
    Klaus: Achievement Unlocked: IRL Troll

  14. #10
    Security

    AMD's HTT will be different though: they will duplicate certain parts of a CPU core, so their HTT is more efficient performance-wise (but the chip gets more expensive as well), whereas Intel simply makes use of the inactive parts of a core.



  15. #11
    ThE NaMeLeSs

    AMD Bulldozer architecture to be revealed in August
    AMD is working on a new x86 chip architecture code-named Bulldozer. The architecture will be used in chips manufactured using the 32-nm process. The company scheduled a 16-core chip code-named Interlagos for release in 2011. Bulldozer is AMD's first properly new processor architecture since the Athlon 64 of 2003. Every AMD chip since then has been a variation on that theme. A tweak here, an added core or floating point unit there, perhaps. But basically the same design.

    Not so for Bulldozer. It's a genuinely novel architecture. Novel enough, in fact, to make describing it something of a semantic assault course.

    Instead of traditional execution cores, Bulldozer chips will be made up of one or more "modules". Each module packs a pair of integer units and a single shared floating-point resource. The latter is actually a pair of 128-bit FMACs, but let's not get ahead of ourselves.

    AMD will disclose more details about its forthcoming code-named Bulldozer micro-architecture at Hot Chips conference in late August. Potentially, micro-architectural details may reveal projected performance of the forthcoming multi-core central processing units.

    In the program of the Hot Chips conference AMD itself describes Bulldozer core as “a new approach to multithreaded compute performance for maximum efficiency and throughput”, which means that the forthcoming core does include a multi-threaded technology, which may be completely different from implementations from companies like Intel Corp. or Sun Microsystems. AMD plans to present the Bulldozer details on the 24th of August, 2010 [via xbit].
    Source

    Thought I might add this.

  16. #12
    Security

    Slightly old but hasn't been posted yet:

    AMD brings Ontario Fusion forward, delays Llano
    AMD has brought forward the launch of its Fusion offering codenamed Ontario and at the same time has delayed its Llano Fusion chip. Llano is delayed because of 32 nanometre yield problems. Llano is being built by GloFo (GlobalFoundries) on a 32 nanometre process while Ontario will be built by TSMC on a 40 nanometre process.

    Dirk Meyer, CEO of AMD, speaking at an analyst conference call, said that Ontario, an APU which includes the Bobcat CPU core, "will be a game changer". He said that AMD expected Ontario to be the first Fusion product to come to market and that it will ship in the fourth quarter of this year, ahead of schedule.

    But AMD is delaying its Llano Fusion offering - that's because of insufficient yields on 32 nanometres. Llano, Meyer said, will be pushed back a couple of months but shipments will happen in the first half of 2011.

    He said that AMD had switched its efforts to Ontario, with its timeline "changing quite dramatically".

    AMD, said Meyer, has three different Fusion designs in four packages for notebooks and desktops. Ontario is targeted at low cost, low power netbook and small form factor category. There will be two designs under the Llano codename. Ontario will ship for revenues in the fourth quarter.

    There's still robust demand for enterprise servers, said Meyer. But Magny Cours only became available in June. On the client PC side, that's AMD's lowest priority, said Meyer.

    Meyer said demand for GPU offerings in the quarter was very strong, but was constrained by supply. He said that in the second half of the year, demand will remain healthy and AMD expects GPU constraints to ease.

    He said that AMD expected to see server products start to deliver serious revenues in the third quarter.

    AMD, he said, had more than tripled the number of Vision branded products. He claimed that Vision was the most successful launch in AMD's history with 130 design wins across multiple prices.

    He said that the Bulldozer core will sample in the second half of this year, on track for launch next year.
    Source.


    And since this is Intel's competitor for Fusion:

    Intel to limit Sandy Bridge overclocking?
    IF WE'RE TO believe what are meant to be Intel presentation slides for its upcoming Sandy Bridge processors, which were embedded in a video posted on YouTube by HKEPC, it looks like Intel's LGA-1155 processors will have very limited overclocking potential. The reason is that Intel decided to "help" with cost cutting by building a clock generator into the chipset, rather than relying on an additional chip on the motherboard.



    However, by doing so, Sandy Bridge processors on the LGA-1155 platform won't be easily overclocked as the way Intel implemented the clock generator means that all the busses are tied to it. The end result of this is that if you try to increase BCLK you'll also increase the speed of all other busses in the system, such as USB, SATA, PCI Express, DMI etc. Not exactly a great implementation, at least not for anyone that's interested in overclocking their system as Intel claims that you won't be able to push the bus by more than two to three percent.



    There appears to be another underlying reason for this, Intel wants to sell more expensive CPUs to overclockers. The company is getting ready to launch more K-series processors with unlocked multipliers specifically for overclockers, although we're not sure that the overclocking scene will be all that tempted, as you can only do so much with an unlocked multiplier when you can't move the bus speed. Judging by the slides, Intel will offer fully unlocked and partially unlocked processors, where the fully unlocked models appear to be similar to today's XE processors. Intel's Turbo feature will of course work as it does today on its Core iSomethingMeaningless processors and Intel has also added native support for DDR3 memory overclocking.



    Things get a little bit trickier when you realise that the memory multiplier is only unlocked when you're using Intel's P-series chipsets such as the upcoming P67. We'd guess that Intel will charge a premium for this chipset compared to the H67 chipset. Memory speeds of up to 2133MHz are supposedly supported which is a huge improvement over the P55 chipset, yet not nearly as useful.
    The LGA-2011 platform on the other hand works quite differently as Intel has fixed the PCI Express and DMI bus speeds inside the CPU which means that it's possible to push the BCLK on these processors without running into the same issues you would on the LGA-1155 platform. However, beyond Intel's Turbo feature, you're not getting any freebies here as no XE processors are multiplier locked and there are no K-series processors on LGA-2011. Again, DDR3 overclocking is part of the package here too, but all the way up to 2666MHz and beyond.





    It seems like Intel isn't happy with so many of its customers buying slower processors and overclocking them easily and this is Intel's way of telling the world this. We can't but wonder if it's really worth it for Intel, especially if AMD manages to get a competitive part or two into the market. Competitive overclockers are likely to shun the LGA-1155 platform altogether, as it wouldn't offer much in terms of a challenge to overclock and it's not going to be much of a competition if everyone manages to reach the same speeds.
    There doesn't seem to be any simple way around Intel's clock generator implementation either, but if a motherboard manufacturer manages to find a workaround, it will put that company way ahead of the competition. We're aware that the Taiwanese motherboard engineers are very clever people, but this seems like a near enough impossible nut to crack, but luckily there are still a few months to work on removing Intel's latest spanner in the wheel for overclockers.
    Source.
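
    The BCLK coupling described in the article above can be sketched roughly as follows. This is only an illustration: the function names, the 100 MHz base clock and the multipliers are hypothetical, and the model simply assumes that every coupled bus runs at a fixed ratio of BCLK, so a given percentage increase in BCLK pushes every bus out of spec by the same percentage.

```python
# Buses the article says are tied to the chipset's clock generator.
COUPLED_BUSES = ("USB", "SATA", "PCI Express", "DMI")


def bclk_overclock(base_bclk_mhz, new_bclk_mhz, cpu_multiplier):
    """Model a BCLK overclock when every bus scales with the same clock.

    Assumption: each coupled bus runs at a fixed ratio of BCLK, so a given
    percentage increase in BCLK pushes every bus up by that same percentage.
    """
    increase_percent = (new_bclk_mhz / base_bclk_mhz - 1) * 100
    return {
        "cpu_mhz": new_bclk_mhz * cpu_multiplier,
        "bus_increase_percent": {bus: increase_percent for bus in COUPLED_BUSES},
    }


def multiplier_overclock(base_bclk_mhz, new_multiplier):
    """Model an unlocked-multiplier overclock: BCLK and the buses stay at stock."""
    return {
        "cpu_mhz": base_bclk_mhz * new_multiplier,
        "bus_increase_percent": {bus: 0.0 for bus in COUPLED_BUSES},
    }


if __name__ == "__main__":
    # Hypothetical chip: 100 MHz BCLK with a 33x multiplier (illustrative values).
    print(bclk_overclock(100.0, 103.0, 33))
    print(multiplier_overclock(100.0, 40))
```

    Under this simplified model, a BCLK bump at the top of the two to three percent range the article quotes adds only about 3% to the CPU clock while dragging USB, SATA, PCI Express and DMI up with it, whereas an unlocked multiplier raises the CPU clock and leaves the buses alone.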



  17. #13
    Commissario Hacienda
    Join Date
    Jan 2006
    Location
    Blackpool
    Age
    32
    Posts
    2,325

    Awards Showcase

    Downloads
    0
    Uploads
    0
    vCash
    500
    Rep Power
    120


    If the above is true (Sandy Bridge CPU clock linked with the PCI bus), that's the single most stupid thing I've ever heard, but I guess they can then legitimise keeping older products that do overclock well at prices way over their true value.

    As much as I have no allegiance to either company, the more Intel takes the lead in the high-end CPU market, the more its greed shows, and for the consumer's benefit I hope AMD closes the gap sooner rather than later.

  18. #14
    Underboss Panther_Seraphi
    Join Date
    Dec 2008
    Location
    Birmingham, Midlands
    Age
    21
    Posts
    719

    Awards Showcase

    Downloads
    2
    Uploads
    0
    vCash
    500
    Rep Power
    3


    Sounds like Intel is trying to do an Nvidia, but doing it where AMD is starting to have an answer to their chips.

  19. #15
    Security

    AMD Ontario to use less than 18-25 watts?
    This week, information surfaced about the AMD Fusion processor that will presumably reach the market first, codenamed Zacate.
    It was reported that this chip would probably be part of the Ontario family, which underlies the Brazos platform for mobile machines, and would get a TDP of 18 to 25 watts.
    Meanwhile, a new source has contacted X-bit Labs and noted that these details are far from accurate.

    The source, who reportedly moves in circles close to AMD, says that Ontario will certainly not be given a TDP of 18 to 25 watts. Exact values were not released, but significantly lower numbers were mentioned. Ontario chips will use the Bobcat architecture, which AMD has worked on for several years and which is the company's answer for energy-efficient devices. Ontario can keep its TDP in a range from less than 1 watt up to 10 watts and will thus go up against Intel's Atom. This means that we are going to encounter Ontario chips in netbooks, tablets and other ultra-portables.

    ...
    Original source.
    Translated source.
    Last edited by Security; 30-07-2010 at 12:41 PM.



  20. #16
    Security

    What is Bulldozer
    As we start the “Bulldozer” Blog, it is important to make sure that everyone is grounded on exactly what this planned product is; this will help you understand the next dozen or more blogs that will be published over the upcoming weeks and months.

    Bulldozer is the code name of one of our two next generation core architectures, the other being “Bobcat”. The AMD Opteron™ processor family has been built over the past 7 or so years from a common core architecture that has grown over time from a single core design to today’s power-efficient, 64-bit, 12-core, virtualization-aware processors.

    This new generation of processors is being designed with some new technologies that will help make these processors more efficient, higher performing and more power optimized than anything we’ve offered to date.

    The platform changes that AMD introduced in 2010 were very deliberate and implemented with an eye toward Bulldozer. Simply stated: Bulldozer is being designed so as to not require AMD customers to change from their current AMD Opteron™ 4000 & 6000 Series platforms. Our new AMD Opteron™ 6000 Series platform (G34 socket-based) and our new AMD Opteron 4000 Series platform (C32-socket based) are compatible with the new Bulldozer products we plan to introduce in 2011. This means that the 6000 series will be an ideal home for the upcoming “Interlagos” (16-core) processor, with the 4000 series being equally well-suited for the upcoming 8-core “Valencia” processor.

    The AMD Opteron™ 6000 series platform was designed to handle both the AMD Opteron™ 6100 series (code named “Magny-Cours”) as well as the future processors based on the Bulldozer core. Obviously until we have the final silicon in hand we can’t make any claims, but it is our expectation that customers will have a much easier time managing multiple generations of processors because we expect the underlying platform to be the same.

    Bulldozer is being designed to support DDR-3, just like today’s platforms. All of the variations that we see today (standard, low power, registered and unbuffered) will be joined by two new options: Load Reduced DIMMs, and a new 1.25V low power option (lower than today’s 1.35V). Capacities and speeds will be driven by the market more than our platforms. We’ve designed a platform specification that supports higher speed and higher capacity than what we offer today, but we do have to be realistic – our technology partners will probably support those options that are JEDEC compliant and commercially available at the time of launch.

    Now, what aren’t we going to talk about? Well, there is always that set of questions that we get asked over and over again, but we reserve the data for launch. So, let me save you some time on asking:

    Performance: We release benchmarks at launch, so don’t expect too much detail there anytime soon. From a performance standpoint, if you compare our 16-core Interlagos to our current 12-core AMD Opteron™ 6100 Series processors (code named “Magny Cours”) we estimate that customers will see up to 50% more performance from 33% more cores. This means we expect the per core performance to go in the right direction — up. That is all I will say until launch.

    Pricing will be available at launch as with all of our other products.

    Launch date is currently set for sometime in 2011. I do realize that this is a wide range, but as we get closer to launch, we’ll narrow down the window a bit. Product development milestones will not be delivered through this blog for competitive reasons I’m sure you can appreciate. If we do release any schedule milestone achievements, we’ll let Dirk or someone from the engineering team have that honor.

    This should get you up to speed with Bulldozer, stay tuned for more updates on a regular basis.
    Source.
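
    The per-core claim in the blog post above can be checked with a quick back-of-the-envelope calculation. This is a rough sketch, not AMD's own math: the function name is made up for illustration, "up to 50%" is treated as exactly 50%, and total performance is assumed to be per-core performance times core count.

```python
def per_core_uplift(old_cores, new_cores, total_uplift):
    """Return the implied per-core performance change as a fraction.

    Assumption: total performance = per-core performance * core count,
    so the per-core ratio is the total ratio divided by the core ratio.
    """
    core_ratio = new_cores / old_cores
    total_ratio = 1 + total_uplift
    return total_ratio / core_ratio - 1


if __name__ == "__main__":
    # 12-core Magny-Cours vs 16-core Interlagos, 50% more total performance.
    uplift = per_core_uplift(12, 16, 0.50)
    print(f"Implied per-core uplift: {uplift:.1%}")
```

    With 16 cores against 12 (a third more cores) and 50% more total performance, the per-core ratio works out to 1.5 / 1.333, or roughly 12.5% more per core, which matches the post's statement that per-core performance goes up. Since the 50% is an "up to" figure, this is an upper bound under the stated assumptions.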


    More Sandy Bridge performance numbers
    In contrast to Bulldozer, there is already a nice collection of benchmark numbers for Sandy Bridge. For example, there are those posted by Coolaler, a few BOINC benchmark results and a video with a mobile Sandy Bridge running Cinema 4D. The video analysis done in the Planet3DNow forums resulted in a deciphered score of 19641, confirmed by the measured run time (44 s). This means the tested mobile Sandy Bridge processor was as fast as a Core i7-975 Extreme. Another comparison could be done by using a recently published Geekbench result of a 1.6 GHz Sandy Bridge CPU. So I compared it to a Core i7 also running at 1.6 GHz and made the following table with overall results and a diagram showing the differences in detail.



    So the average performance increase with those CPUs at the same base clock, but with different Turbo Boost implementations, is about 20%. In the diagram below we can see a significant average difference in multi-threaded benchmarks:

    Source.



  21. #17
    Panther_Seraphi

    So they are comparing it to a mobile i7? How does that work? I believe it's a moot point, because how do you know it's not the platform holding the chip back?
