Table Of Content

Bandits meet Computer Architecture: Designing a Smartly-allocated Cache YonatanGlassner KobyCrammer ElectricalEngineeringDepartment TheTechnion-IsraelInstituteofTechnology 6 Haifa,Israel32000 1 [email protected],[email protected] 0 2 February2,2016 n a J 1 Abstract cache to the threads in a way to maximize hit-rate, 3 namelythefractionofmemorycallsgetansweredby ] In many embedded systems, such as imaging sys- the (fast) cache, and not the (slow) memory. An ef- G tems, the system has a single designated purpose, fective allocation would take into consideration the L and same threads are executed repeatedly. Profiling various requirements of the threads and their behav- . s thread behavior, allows the system to allocate each ior when memory lacks. However, the exact nature c [ thread its resources in a way that improves overall of this behavior is unknown, and should be learned systemperformance.Westudyanonlineresourceal- from experience. The repeatability of such systems 1 v locationproblem,wherearesourcemanagersimulta- provides opportunity for good threads identification 9 neously allocates resources (exploration), learns the (hit rate as function of given resources), which al- 0 impactonthedifferentconsumers(learning)andim- lowstheresourceallocatortoapproachoptimalallo- 3 0 proves allocation towards optimal performance (ex- cations. 0 ploitation).Webuildontherichframeworkofmulti- . The problem of making decisions under uncer- 2 armed bandits and present online and offline algo- tainty,orpartialknowledge,wasstudiedextensively 0 rithms. Through extensive experiments with both 6 intheliterature,andapopularmodelforthisproblem 1 synthetic data and real-world cache allocation to istheMulti-ArmedBandit(MAB)[1].Inoursetting, : threadsweshowthe meritsandpropertiesofoural- v on each iteration the decision maker must choose i gorithms. X an allocation of the available resources (memory) r for the threads to be executed, corresponding to the a 1 Introduction ’arms’intheMABmodel.Subsequently,allthreads are executed (arm is pulled), and a stochastic hit rate (reward) is observed for each thread. A suitable Consider a real-time X-ray system used for surgery. framework for our setting is the combinatorial ban- Such a system performs extensive real time image dits(akaCMAB)framework,underfullinformation processingofastreamofimages,andisrequirednot feedbacksettings. to have delays, nor loose frames. In practice such a systemexecutesmanythreads(suchasFFTonvari- A typical relation between the memory allocated ous parts of the image) on a few cores and a shared to a thread and its hit-rate is presented in Fig. 1 (we cache,whichallowsfastaccesstomemory.Sincethe present the hit-rate vs the amount of memory allo- amountofcacheislimited,thereisaneedtoallocate cated for two applications: gcc and bzip, which are 1 part of the CPU SPEC 2006 benchmark). This rela- In their fundamental paper, Auer et al. [4] pre- tion is stochastic, and, its mean is characterized by sented the UCB1 (upper confidence bound) algo- two important properties: it is monotonic in the al- rithm. On each iteration, the algorithm chooses the locatedmemory,andexhibitsa’diminishingreturns’ armwiththehighestUCBoftheestimatedexpected phenomena.Notethatthememory–expectedhit-rate value. Their key-method performs the exploration- relation is clearly non-linear, thus discouraging the exploitation tradeoff implicitly. They prove that the useoflinearbanditsbasedapproaches. number of time steps a sub-optimal arm is played Many existing approaches for this problem are isbounded,yieldinglogarithmicregret(performance static and hard-coded into hardware (see Liu et al. difference of an algorithm and optimal policy) uni- [2]). Moreover, they ignore the specific characteris- formlyforallfinitetimes. tics of the threads. Instead, we propose to learn the Another line of research is when the algorithm statisticalnatureofthreads,anduseittomakeagood may choose more than a single arm, and there dynamic allocation. We study the empirical proper- is a dependency between the arms. See again the tiesofabenchmarkapplications.Wesuggestapara- manuscript [3] for details and examples. The Com- metricmodelwiththesamequalitativepropertiesas binatorial MAB (CMAB) is a special case of MAB. the real data. The model parameters are estimated Here, arms (sometime also called super arms) are a duringrun-time.However,allocationsaremadedur- combination of basic arms, chosen from a finite set ingtheprocess,eveniftheestimatesareonlyrough. F. Therefore, there is a structured dependency be- The rest of the paper is organized as follows: in tween super arms. Ignoring that structure and using Sec. 2 we describe related work in the field of multi traditional algorithms yields poor performance (see armedbandits.Inaddition,webrieflyproviderelated e.g.theworkofGaietal.[5]),comparedtostructure- work in the field of dynamic resource allocation in consideredalgorithm. computer systems. In Sec. 3 we formulate our prob- In a more recent paper, Chen et al. [6] proposed lem and the modeling function chosen with respect ageneralCMABformulation.Oneachstep,asuper to the formulation. In Sec. 4 we introduce our algo- arm, which is a subset of arms, is chosen out of a rithms together with their analysis of expected per- finite subset group F ∈ 2[m], where 2[m] is the set formance. Empirical study with both synthetic and of all possible subsets of arms, taken from m basic real data is summarized in Sec. 5, and we conclude arms. The expected reward is a general function of with Sec. 6. Due to lack of space, some Technical the set of arms played and expected performance of materialandproofsarenotprovidedinthispaper. all arms. In their algorithm, they assumed the exis- tenceofanOracle,whichprovidesagoodsuperarm 2 Related work withhighprobability.Inourspecificproblem,wedo not assume such an oracle exists, thus can not use their framework directly. We rather provide a self- The MAB problem is widely studied these days, contained algorithm, which senses the environment, where different formulations model various exploration-exploitation (aka exp-exp) tradeoff providespredictionsandacts. alternatives. We clearly can not review all variants, Inamorepracticalaspect,resourceallocationwas and refer the reader to a recent manuscript in the investigatedinthefieldofcomputerarchitecture(see area[3]. the work of Liu et al. [2] and of Suh et al. [7] for Lai and Robbins [1] proposed one of the earli- cache hit rate optimization, and of Bitirgen et al. estMABversion,inwhichthereareN independent [8] for global resources optimization using Artifi- arms, each producing stochastic i.i.d rewards, taken cialNeuralNetwork).However,weareinterestedin fromaknownfamilyofdistributions,withunknown providingageneralframeworkfortheresourceallo- parameters.Theobjectiveistochoosearmssequen- cation problem, rather than a fine-tuned per domain tially,soastomaximizethetotalreward. practicalsolution. 2 Recently, Lattimore et al. [9] studied a similar re- 1. The expected reward of the algorithm at time t is source allocation problem, where a system manager givenby, allocated resources to maximize system gain. How- (cid:34) N (cid:35) ever, they assume a piece-wise linear cut-off model, E (cid:88)s = ρ(cid:16)f(cid:126),m(cid:126) (cid:17) . t,i t defined by a single parameter, which is the change i=1 point between the linear and constant range. We use Similarly, the optimal expected reward is given by, a different model which we believe is closer to real (cid:16) (cid:17) world behavior. Specifically, we assume a nonlinear ρ f(cid:126),m(cid:126)∗ . The expected instantaneous regret is function, controlled by two parameters to model the defined to be the difference of rewards, R(t) = (cid:16) (cid:17) (cid:16) (cid:17) consumer gain. Our more complicated model yields ρ f(cid:126),m(cid:126)∗ −ρ f(cid:126),m(cid:126) ,andthecumulativeexpected t differentalgorithmswithdifferentbehavior. regret is the sum of instantaneous regrets, R = (cid:80)T R(t). The goal of a learning algorithm is to t=1 3 Problem Setting minimizetheexpectedcumulativeregret. It remains to define a family of parametric re- We now describe formally the memory to threads wardfunctionsF fromwhichf willbechosen.We i allocation problem. There are N threads (or arms) restrict our discussion to families with two natural which share M identical units of memory. We will properties, which are inherent for the resource allo- next assume, by a simple normalization, that all re- cationmodel: sourcesaresummedto1.Oniterationtanallocation Monotonicity: Allocating more memory does not algorithmpartitionsthememorytothethreads,allo- decrease the expected reward, that is f(m ) ≥ 1 catingm ∈ [0,1]fractionofthememorytothread f(m )form ≥ m . t,i 2 1 2 i. We denote by m(cid:126) = (m ...m ) the resource Diminishing returns: Allocating more of memory t t,1 t,N allocationvector,whereclearlythealgorithmcannot does not increase the per-memory unit expected re- allocatemorethan100%oftheresourcenorallocate ward, that is, f(m +δ)−f(m ) ≥ f(m +δ)− 1 1 2 (cid:80) negative resources, thus m ≤ 1 and m ≥ f(m )form ≤ m . i t,i t,i 2 1 2 0,∀t,i. Once the resources are allocated, or parti- These two properties were also observed in our tioned, each thread i receives a stochastic bounded taskofallocatingmemory.InFig.1,mentionedpre- reward s ∈ [0,1] based on these resources. We viously,wecanclearlyobserve,thatthesetwoabove t,i denote the reward vector by (cid:126)s = (s ...s ). propertiesholdforrealapplications. t t,1 t,N We assume that the expectation of each reward is Theaboveobservationmotivatedustoproposethe given by a function of the resource allocated, that is following simple family of functions, with two pa- E[s ] = f (m ), where f (x) is a fixed unknown rametersγ andβ , t,i i t,i i i i (or partly unknown) function. In our application of f (m ;γ ) = γ ·mβi allocating memory to threads, the algorithm reward i i i i i isthehit-rateobtainedforthespecificallocation.We where γ ,β ∈ [0,1]. The parameter γ indicates the denote by f(cid:126) = (f (·)...f (·)), the vector of ex- i i i 1 N maximalexpectedrewardofthreadiifallresources pectedreward-functions,andby, are allocated to it, as f (1;γ ) = γ , and thus is i i i boundedbyaunit,themaximalpossibleexpectedre- N (cid:16) (cid:17) (cid:88) ρ f(cid:126),m(cid:126) = fi(mi), ward.βi isacurvatureparameterandisboundedby a linear line. For β ≈ 0 an infinitesimal amount of i resource allows maximal gain, while for larger val- The expected reward of allocation m(cid:126) with reward ues of β the hit-rate - memory dependency is closer functionsf(cid:126).GivenasetofN functionsf(cid:126),anoptimal tolinear. allocationmaximizestheexpectedrewardandgiven For a given β(cid:126), we identify a concave function (cid:110) (cid:111) by m(cid:126)∗ = argmaxm(cid:126) ρ f(cid:126),m(cid:126) subject to (cid:80)imi ≤ f(m) with the single associated parameter (cid:126)γ. We 3 • Input:β(cid:126) • Initialization: Play N steps, where for each arm i, allocate all memory to thread i, m = t,i δ andreceiverewards . t,i t,i • Fort = N +1...T – Computeestimates: γˆ t,i – ComputeUCBγ = min{1,γˆ }+e (cid:101)t,i t,i t,i – Allocatememoryaccordingto{γ } (cid:101)t – Executewithm(cid:126) allocation t – Receivereward(cid:126)s ∈ {0,1}N t Figure1:Expectedhit-ratevsmemoryallocationfor bzip and gcc applications. Error-bars indicate a unit standard-deviation.Highervaluesindicatebetterper- Figure2:UCB-RAAlgorithmtoallocatememoryto formance. threads. abuse notation and write the expected instantaneous (cid:16) (cid:17) rewardas,ρ (cid:126)γ,m(cid:126);β(cid:126) = (cid:80)N γ mβi ,whichcanbe i i i bination as an arm. We can now play with all arms computed analytically if there is a shared parameter using the existing MAB algorithms (i.e. UCB1 of β acrossallthreads.Weusethatpropertyinthecon- Auer et al. [4]). However, such an approach ignores vergenceproof. the structure of the problem and, since the actions Weusethefactthatρ((cid:126)γ,m(cid:126))isaweightedβ-norm are exponential in the number of threads, that ap- oftheallocationvectorm(cid:126) tofindtheoptimalalloca- proach is not feasible in practice. We propose two tionm(cid:126)∗ whentheparametervector(cid:126)γ isknown. algorithms. The first algorithm performs an initial Lemma 1. Assuming β = β,∀i, the opti- pure exploration stage, and then a pure exploitation i mal allocation which maximizes the gain m(cid:126)∗ = stage. The second algorithm is a UCB-like algo- argmax ρ((cid:126)γ,m(cid:126)(cid:48))isgivenby, rithm, which fundamentally trades-off exploration- m(cid:126)(cid:48) exploitation.Weanalyzethealgorithmsintheshared 1 γ1−β β case,whileourexperimentsalsodonefortheper- m∗ = i , (1) i N 1 threadβ parameters. (cid:80) γ1−β i j j=1 We start with the first algorithm. When the num- andtheexpectedrewardisgivenby, ber of rounds T is known, the algorithm performs maxρ(cid:0)(cid:126)γ,m(cid:126)(cid:48)(cid:1) = ρ((cid:126)γ,m(cid:126)∗) = (cid:107)(cid:126)γ(cid:107) . (2) pure-exploitationforξT rounds,allocatinganequal m(cid:126)(cid:48) 1−1β amount of 1/N memory to all threads. At this point thealgorithmcomputesestimationoftheparameters For brevity, the proof is not given here, but it can γˆ . In the remaining rounds, the algorithm allocates beeasilyderivedbyapplyingHo¨lderinequality. i m (γˆ ...γˆ )using(1)asiftheestimatesaretheop- i 1 N timalparameters.WecallthatalgorithmFETE(First 4 Algorithms ExploreThenExploit)anditissummarizedinFig.4. Asimpleapproachforsolvingourproblemistodis- We use the following linear estimator for γ from i crete the allocation space of m(cid:126) and treat each com- pairs {(m ,s )}T (all parameters are subjected t,i t,i t=1 4 • Input 0 ≤ ξ ≤ 1 the exploration-exploitation tradeoff parameter, β the system parameter, horizonT. • For t = 1,...,ξT: allocate m = 1 for all t,i N threadsi = 1···N t (cid:80) mβτ,i·sτ,i • ComputeEstimates:γˆ = τ=1 i t (cid:80) m2β τ,i τ=1 • For t = ξT +1,..., T: allocate memory ac- cordingtoestimatesγˆ using(1): (a) 1 γˆ1−β m = i ,i = 1···N t,i N 1 (cid:80)γˆ1−β i i=1 Figure 4: First exploit then explore algorithm for memoryallocation. OnecanshowthattheMMSEestimatorisgivenby (b) (cid:80)t mβ ·s τ=1 τ,i τ,i γˆ = Figure3:Syntheticdataresults.(a)Performanceof t,i (cid:80)t m2β five algorithms on synthetic data with fixed N = 8 τ=1 τ,i and β = 0.1. The cumulative hit-rate (reward) is It is also concentrated around its mean, applying shown vs number of iterations. Five algorithms are Hoeffdinginequality[10] evaluated: Uniform (Fair-Share), (cid:15)-greedy (with or (cid:32) t (cid:33) (cid:88) without the model), UCB-RA, and the FETE algo- Pr[|γˆ −γ | > (cid:15)] ≤ 2exp −2(cid:15)2· m2β . t,i i τ rithm (b) Algorithms performance for the case of τ=1 known but different β’s. FETE produces good re- (4) sults,UCB-RApresentsthebestperformance. An interesting property of that concentration in- equality is that the variance is monotonically de- creasingintheamountofresourcesanarmreceives. tothreadi) However, dependency is not linear. The algorithm must perform efficient exploration steps given this (cid:80)t ω ·s γˆ = τ=1 τ τ . (3) property. t (cid:80)t ω ·mβ Thefollowingtheoremboundstheregretoftheal- τ=1 τ τ gorithm, by setting ξ as a function of the number of roundsT. It is easy to show that this general formulation of a linearestimator,givenasetofcoefficients ωτ,i ∈ Theorem 1. Assuming βi = β,∀i, the regret of τ=1,···,t the algorithm in Fig. 4 is upper bounded by, R ≤ R,isunbiased: ξTN+√2N√1 √T(cid:112)log(2/δ).Furthermore,plug- ξ (cid:113) E[γˆ ] = E(cid:34) (cid:80)tτ=1ωτ ·sτ (cid:35) = (cid:80)tτ=1ωτ,i·γmβτ,i = γging the opt(cid:113)imal value, ξopt = 3 ln2(T2δ) , yields, t (cid:80)tτ=1ωτ ·mβτ (cid:80)tτ=1ωτ,i·mβτ,i R ≤ 2NT23 3 ln(22δ) = O(T23). 5 IncontrasttotheFETEalgorithmwhichseparates We first assume shared β and random values of the exploration and exploitation to distinct epochs, (cid:126)γ.Wevariedthenumberofthreads-N.Oncethese we now propose an alternative algorithm which in- parameterswereset,weexecutedeachalgorithmfor herently combines exp-exp, using the UCB tech- 10,000 iterations, and computed reward and regret nique. with respect to the optimal allocation (according to Our algorithmic approach for the second algo- thevaluesetfor(cid:126)γ andβ). rithm is inspired by the UCB1 algorithm of Auer et Next, we ran our algorithms for known values of al. [4]. However, since each arm reward is a func- different β, one per process. Yet, since knowing the tionofitsgivenresources,weprovideateachtimet β’s is a strong assumption, we assumed that the β’s a Model Upper Confidence Bound (which is simply areknownuptoasmalldifference.Thisassumption a UCB on the model parameters) which we obtain isrealistic,asweobservedinrealdatathatthereare bythestatisticalmodel,andpreviousallocationsand roughlytwoclustersofprograms:memory-intensive rewards. We then optimize the allocation according andnon-intensive.Memory-intensiveprogramshave to the optimistic model, rather than the current esti- lower β’s in contrast with the non-intensive ones. matedone. This clustering can be performed off-line, with re- Inthegeneralcase,ouralgorithmestimatesmodel spect to the program’s characteristics. For the FETE parameters {γˆt,i} and computes optimal allocation algorithm we simply took the highest β as a shared withrespecttoUCBuponmodelvariables.Thisen- one. suresthatallthreadswillreceiveenoughmemoryto We compare five algorithms: (1) Fair-Share, explore (estimate) well, and that the parameters es- which is simply a uniform allocation, (2)+(3) (cid:15) - t timators will converge quickly enough to their true greedy, which performs a random exploration step values. with probability (cid:15) , or otherwise exploits. We com- t We call the algorithm UCB-RA for resource allo- pare two versions of (cid:15) -greedy: the first uses the t cation, and it is summarized in Fig. 2. In each step, modelintheexploitationstepwhiletheseconduses we provide a UCB e on the γ estimator. In the t,i t,i the empiric best-so-far allocation. The parameter (cid:15) t general case, we set et,i using (4). For the analysis, decreases over time, and is given by: (cid:15)t = √(cid:15)0 We we again assume that β is equal for all threads. For t i tried different values of (cid:15) , and took the one with 0 thatcase,wesete = e = t−η forη = 0.5(1−β). t,i t bestperformance.(4)TheexplorethenexploitFig.4. Thistermisclearlynotoptimal,sinceattimetithas (5)UCB-RA of Fig. 2. The methods of Liu et al. shared value for all threads, regardless of their per- [2], Suh et al. [7], Bitirgen et al. [8] are not relevant formance so far. This choice of η allows us to prove tooursettings,asthefirstisnotdesignedtooptimize thefollowingtheorem,whichisprovidedherewith- thecumulativehit-rate,thesecondusedapre-defined outproof,forspacelimitation. model,andthethirdassumesthestatisticsareknown (i.e.,thereisaseparatetrainingphase). Theorem 2. The regret of the algorithm of Fig. 2 is (cid:16) 1+β(cid:17) Experiments are summarized in Fig. 3. The case upperboundedby,O(cid:101) F(N,β)+T 2 ,wherethe of same β is not a realistic case, and was simu- functionF isindependentofT. lated to support the theoretical analysis. One can clearly observe the ”knee” of the FETE algorithm, 5 Experimental results which mark the exploration-exploitation transition point. The uniform and (cid:15)-greedy algorithm w\o the We evaluate our algorithms’ performance on both model suffer a linear regret. In the case of differ- synthetic and real data. Starting with the synthetic ent and partly-known β’s (as explained previously), data, we generated data using that model: a thread our UCB-RA algorithm obtain the best results (see producesaBernoullidistributedbinarystochasticre- Fig.3(b)). ward,suchthatPr(s = 1) = γ mβi. We performed extensive experiments to evaluate t,i i t,i 6 (a) (b) (c) Figure 5: Synthetic data results. Algorithms performance for the case of known but different β’s. FETE producesgoodresults,UCB-RApresentsbetterones.Tough(cid:15)-greedypresentsthebestresults-itsstochas- ticityparameterhastobetunedmanually,whichmeansitislessrobust. the algorithms in a real-world setting. The task is to Performancefor2,3and4programcombinations divide L1-cache among threads which are executed issummarizedinFig.5.Eachpointinthegraphrep- onthecore. resents a specific combination of applications to be We chose six programs belong to the SPEC- executed.Thex-axisisanindexofaspecificcombi- CPU2006 benchmark: bzip, lbm, mcf, waves and nation.The y-axisisthe IPCnormalizedbyIPC opt 2 gcc instances with different inputs. We executed (1isthemaximalvalue).Theoptimalallocationwas each of the six programs and recorded all the mem- computedbyanexhaustivesearchintheoptionalal- ory accesses, both read and write. We then divided locationsspace. each memtrace to T distinct consecutive segments. For each algorithm, higher points indicate better Wesimulatedeachofthesememtracesusingacache performance. In the left panel we choose 2 out of 6 simulator,with2Kavailable2-setcache,dividedinto programs (15 combinations), in 13 of which UCB- blocksofsize16bit. RA was better than FairShare. In the middle graph Weevaluatecombinationsof2,3or4outofthe6 - 19 out of 20, and in the right graph - 13 out of 15. programs. For example, we started the execution of Notethattherewardtendstobemoresignificantthan bzipandgccatthesametime,andsotheyneededto FairSharewhentheproblemishard,i.e.,thehitrate sharememory. is low. Note that though (cid:15)-greedy achieves high per- Two metrics are used: the hit-rate, which is the formance, we still needed to set its parameters, and frequency of memory accesses that were stored in thusislessrobust. the cache, and instructions per cycle (IPC). Specif- ically, we computed the memory dependent IPC 6 Discussion (and not the total IPC, which depends on many parameters). For simplicity, we replaced the different Weinvestigatedstatisticalmethodstoallocatemem- cache hierarchy timings and probabilities with one ory to threads. We proposed a simple model for the term: Memory , and used the following equation time problem, that accurately captures the properties of tocomputetheIPC: the real memory allocation problem. We provided (IPC )−1 = MR×Memory +(1−MR)×Cacthweo alg,orithms for the task, and performed an em- mem time time piricalstudywithbothsyntheticandreal-worlddata. where MR is the average miss-rate of a program, Weexecutedseveralrealapplicationsinacontrolled Memory and Cache are the time (in cycles) memory environment and analyzed a few allocation time time needed for a memory access and cache access. The strategies.Thememory-UCBoutperformedallother formerisabout20timesthelatter. methods. 7 Although we have restricted our discussion to al- [10] W. Hoeffding, “Probability inequalities for sums of locatingmemorytothreads,thetoolsdevelopedhere bounded random variables,” Journal of the American Statistical Association, vol. 58, no. 301, pp. 13–30, canbeusedinothercontexts,suchasallocatingmain March 1963. [Online]. Available: http://www.jstor.org/ memory and cores to processes, allocating network stable/2282952? bandwidthtoclients,andsoon. References [1] T.L.LaiandH.Robbins,“Asymptoticallyefficientadap- tive allocation rules,” Advances in applied mathematics, vol.6,no.1,pp.4–22,1985. [2] C.Liu,A.Sivasubramaniam,andM.Kandemir,“Organiz- ingthelastlineofdefensebeforehittingthememorywall forcmps,”inSoftware,IEEProceedings-. IEEE,2004, pp.176–185. [3] S. Bubeck and N. Cesa-Bianchi, “Regret analysis of stochastic and nonstochastic multi-armed bandit prob- lems,” Foundations and Trends in Machine Learning, vol.5,no.1,pp.1–122,2012. [4] P.Auer,N.Cesa-Bianchi,andP.Fischer,“Finite-timeanal- ysisofthemultiarmedbanditproblem,”Machinelearning, vol.47,no.2-3,pp.235–256,2002. [5] Y. Gai, B. Krishnamachari, and R. Jain, “Combina- torial network optimization with unknown variables: Multi-armed bandits with linear rewards,” arXiv preprint arXiv:1011.4748,2010. [6] W. Chen, Y. Wang, and Y. Yuan, “Combinatorial multi- armed bandit: General framework and applications,” in Proceedingsofthe30thInternationalConferenceonMa- chineLearning(ICML-13),2013,pp.151–159. [7] G.E.Suh,S.Devadas,andL.Rudolph,“Anewmemory monitoringschemeformemory-awareschedulingandpar- titioning,” in High-Performance Computer Architecture, 2002. Proceedings. Eighth International Symposium on. IEEE,2002,pp.117–128. [8] R.Bitirgen,E.Ipek,andJ.F.Martinez,“Coordinatedman- agementofmultipleinteractingresourcesinchipmultipro- cessors:Amachinelearningapproach,”inProceedingsof the 41st annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 2008, pp. 318–329. [9] T.Lattimore,K.Crammer,andC.Szepesva´ri,“Optimalre- sourceallocationwithsemi-banditfeedback,”inProceed- ings of the 30th Conference on Uncertainty in Artificial Intelligence(UAI),2014. 8

Bandits meet Computer Architecture: Designing a Smartly-allocated Cache PDF

0.34 MB·

by Yonatan Glassner

#journals #arxiv

Checking for file health...

Save to my drive

Quick download

Download

Download Bandits meet Computer Architecture: Designing a Smartly-allocated Cache PDF Free - Full Version

by Yonatan Glassner| 0.34

Download Bandits meet Computer Architecture: Designing a Smartly-allocated Cache by Yonatan Glassner in PDF format completely FREE. No registration required, no payment needed. Get instant access to this valuable resource on PDFdrive.to!

Free Download PDF

About Bandits meet Computer Architecture: Designing a Smartly-allocated Cache

No description available for this book.

Detailed Information

Author:	Yonatan Glassner
File Size:	0.34
Format:	PDF
Price:	FREE

Download Free PDF

Safe & Secure Download - No registration required

Why Choose PDFdrive for Your Free Bandits meet Computer Architecture: Designing a Smartly-allocated Cache Download?

100% Free: No hidden fees or subscriptions required for one book every day.
No Registration: Immediate access is available without creating accounts for one book every day.
Safe and Secure: Clean downloads without malware or viruses
Multiple Formats: PDF, MOBI, Mpub,... optimized for all devices
Educational Resource: Supporting knowledge sharing and learning

Frequently Asked Questions

Is it really free to download Bandits meet Computer Architecture: Designing a Smartly-allocated Cache PDF?

Yes, on https://PDFdrive.to you can download Bandits meet Computer Architecture: Designing a Smartly-allocated Cache by Yonatan Glassner completely free. We don't require any payment, subscription, or registration to access this PDF file. For 3 books every day.

How can I read Bandits meet Computer Architecture: Designing a Smartly-allocated Cache on my mobile device?

After downloading Bandits meet Computer Architecture: Designing a Smartly-allocated Cache PDF, you can open it with any PDF reader app on your phone or tablet. We recommend using Adobe Acrobat Reader, Apple Books, or Google Play Books for the best reading experience.

Is this the full version of Bandits meet Computer Architecture: Designing a Smartly-allocated Cache?

Yes, this is the complete PDF version of Bandits meet Computer Architecture: Designing a Smartly-allocated Cache by Yonatan Glassner. You will be able to read the entire content as in the printed version without missing any pages.

Is it legal to download Bandits meet Computer Architecture: Designing a Smartly-allocated Cache PDF for free?

https://PDFdrive.to provides links to free educational resources available online. We do not store any files on our servers. Please be aware of copyright laws in your country before downloading.

The materials shared are intended for research, educational, and personal use in accordance with fair use principles.