llm
gpu_energy(model_active_parameter_count, output_token_count, batch_size, gpu_energy_alpha, gpu_energy_beta, gpu_energy_gamma)
Compute energy consumption of a single GPU.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_active_parameter_count | float | Number of active parameters of the model (in billion). | required |
| output_token_count | float | Number of generated tokens. | required |
| batch_size | int | Number of requests handled concurrently by the server. | required |
| gpu_energy_alpha | float | Alpha coefficient of the energy regression. | required |
| gpu_energy_beta | float | Beta coefficient of the energy regression. | required |
| gpu_energy_gamma | float | Gamma coefficient of the energy regression. | required |
Returns:
| Type | Description |
|---|---|
| ValueOrRange | The energy consumption of a single GPU in kWh. |
Source code in ecologits/impacts/llm.py
generation_latency(model_active_parameter_count, output_token_count, batch_size, latency_alpha, latency_beta, latency_gamma, request_latency)
Compute the token generation latency in seconds.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_active_parameter_count | float | Number of active parameters of the model (in billion). | required |
| output_token_count | float | Number of generated tokens. | required |
| batch_size | int | Number of requests handled concurrently by the server. | required |
| latency_alpha | float | Alpha coefficient of the latency regression. | required |
| latency_beta | float | Beta coefficient of the latency regression. | required |
| latency_gamma | float | Gamma coefficient of the latency regression. | required |
Returns:
| Type | Description |
|---|---|
| ValueOrRange | The token generation latency in seconds. |
Source code in ecologits/impacts/llm.py
model_required_memory(model_total_parameter_count, model_quantization_bits)
Compute the required memory to load the model on GPU.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_total_parameter_count | float | Number of total parameters of the model (in billion). | required |
| model_quantization_bits | int | Number of bits used to represent the model weights. | required |
Returns:
| Type | Description |
|---|---|
| float | The amount of required GPU memory to load the model. |
Source code in ecologits/impacts/llm.py
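As a rough illustration of how such an estimate can be derived, the sketch below converts a parameter count and a quantization width into gigabytes. The function name `sketch_model_required_memory` and the `overhead` factor are assumptions made for this example; the actual formula lives in `ecologits/impacts/llm.py` and may differ.

```python
def sketch_model_required_memory(
    model_total_parameter_count: float,
    model_quantization_bits: int,
    overhead: float = 1.2,  # hypothetical margin for runtime buffers (assumption)
) -> float:
    """Estimate the GPU memory (in GB) needed to hold the model weights."""
    # Billions of parameters times bytes per parameter gives gigabytes.
    bytes_per_parameter = model_quantization_bits / 8
    return overhead * model_total_parameter_count * bytes_per_parameter


# A 7-billion-parameter model stored in 16 bits, with the overhead margin disabled.
print(sketch_model_required_memory(7, 16, overhead=1.0))  # 14.0
```

Keeping the overhead as an explicit argument makes it easy to see how much of the estimate comes from the weights alone.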
gpu_required_count(model_required_memory, gpu_memory)
Compute the number of GPUs required to store the model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_required_memory | float | Required memory to load the model on GPU. | required |
| gpu_memory | float | Amount of memory available on a single GPU. | required |
Returns:
| Type | Description |
|---|---|
| int | The number of required GPUs to load the model. |
Source code in ecologits/impacts/llm.py
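One plausible way to obtain an integer GPU count from two memory figures is a ceiling division, sketched below. The name `sketch_gpu_required_count` is hypothetical, and the rounding rule is an assumption rather than a statement about the library's implementation.

```python
import math


def sketch_gpu_required_count(model_required_memory: float, gpu_memory: float) -> int:
    """Smallest number of GPUs whose combined memory can hold the model."""
    # Round up: a model needing 1.75 GPUs' worth of memory occupies 2 GPUs.
    return math.ceil(model_required_memory / gpu_memory)


# Both values must use the same unit (for example GB).
print(sketch_gpu_required_count(140.0, 80.0))  # 2
```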
server_energy(generation_latency, server_power, server_gpu_count, gpu_required_count, batch_size)
Compute the energy consumption of the server.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| generation_latency | float | Token generation latency in seconds. | required |
| server_power | float | Power consumption of the server in kW. | required |
| server_gpu_count | int | Number of available GPUs in the server. | required |
| gpu_required_count | int | Number of required GPUs to load the model. | required |
| batch_size | int | Number of requests handled concurrently by the server. | required |
Returns:
| Type | Description |
|---|---|
| float | The energy consumption of the server (GPUs are not included) in kWh. |
Source code in ecologits/impacts/llm.py
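The units in the table above (seconds, kW, kWh) suggest a simple attribution scheme, sketched here under stated assumptions: the server's power draw is split in proportion to the GPUs the model occupies and then divided among concurrent requests. The function name and the exact attribution rule are illustrative, not taken from the library source.

```python
def sketch_server_energy(
    generation_latency: float,  # seconds
    server_power: float,        # kW
    server_gpu_count: int,
    gpu_required_count: int,
    batch_size: int,
) -> float:
    """Server energy (kWh, GPUs excluded) attributed to a single request."""
    hours = generation_latency / 3600  # seconds -> hours
    gpu_share = gpu_required_count / server_gpu_count  # fraction of the server in use
    return hours * server_power * gpu_share / batch_size


# One hour on half of an 8-GPU, 1 kW server, shared by 4 concurrent requests.
print(sketch_server_energy(3600, 1.0, 8, 4, 4))  # 0.125
```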
request_energy(datacenter_pue, server_energy, gpu_required_count, gpu_energy)
Compute the energy consumption of the request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| datacenter_pue | float | Power Usage Effectiveness of the data center. | required |
| server_energy | float | Energy consumption of the server in kWh. | required |
| gpu_required_count | int | Number of required GPUs to load the model. | required |
| gpu_energy | ValueOrRange | Energy consumption of a single GPU in kWh. | required |
Returns:
| Type | Description |
|---|---|
| ValueOrRange | The energy consumption of the request in kWh. |
Source code in ecologits/impacts/llm.py
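A minimal sketch of how these four inputs could combine, assuming the IT energy is the server energy plus the energy of every required GPU, scaled by the PUE to cover data center overhead. The function name is hypothetical, and plain floats stand in for `ValueOrRange` to keep the example self-contained.

```python
def sketch_request_energy(
    datacenter_pue: float,
    server_energy: float,    # kWh, GPUs excluded
    gpu_required_count: int,
    gpu_energy: float,       # kWh for a single GPU
) -> float:
    """Total energy (kWh) of a request, data center overhead included."""
    # Energy drawn by the IT equipment: server plus all GPUs holding the model.
    it_energy = server_energy + gpu_required_count * gpu_energy
    # PUE is the ratio of total facility energy to IT energy.
    return datacenter_pue * it_energy


print(sketch_request_energy(1.5, 0.5, 2, 0.25))  # 1.5
```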
request_usage_gwp(request_energy, if_electricity_mix_gwp)
Compute the Global Warming Potential (GWP) usage impact of the request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request_energy | ValueOrRange | Energy consumption of the request in kWh. | required |
| if_electricity_mix_gwp | float | GWP impact factor of electricity consumption in kgCO2eq / kWh. | required |
Returns:
| Type | Description |
|---|---|
| ValueOrRange | The GWP usage impact of the request in kgCO2eq. |
Source code in ecologits/impacts/llm.py
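The units documented above imply a direct product: energy in kWh times an impact factor in kgCO2eq / kWh yields kgCO2eq. The sketch below shows that relation with plain floats; the function name and the numeric values are illustrative assumptions.

```python
def sketch_request_usage_gwp(request_energy: float, if_electricity_mix_gwp: float) -> float:
    """Usage-phase GWP (kgCO2eq) of a request."""
    # kWh * (kgCO2eq / kWh) = kgCO2eq
    return request_energy * if_electricity_mix_gwp


# 0.002 kWh consumed on an electricity mix emitting 0.5 kgCO2eq per kWh.
print(sketch_request_usage_gwp(0.002, 0.5))  # 0.001
```

The same unit reasoning applies to the ADPe (kgSbeq / kWh) and PE (MJ / kWh) impact factors documented in the following entries.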
request_usage_adpe(request_energy, if_electricity_mix_adpe)
Compute the Abiotic Depletion Potential for Elements (ADPe) usage impact of the request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request_energy | ValueOrRange | Energy consumption of the request in kWh. | required |
| if_electricity_mix_adpe | float | ADPe impact factor of electricity consumption in kgSbeq / kWh. | required |
Returns:
| Type | Description |
|---|---|
| ValueOrRange | The ADPe usage impact of the request in kgSbeq. |
Source code in ecologits/impacts/llm.py
request_usage_pe(request_energy, if_electricity_mix_pe)
Compute the Primary Energy (PE) usage impact of the request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request_energy | ValueOrRange | Energy consumption of the request in kWh. | required |
| if_electricity_mix_pe | float | PE impact factor of electricity consumption in MJ / kWh. | required |
Returns:
| Type | Description |
|---|---|
| ValueOrRange | The PE usage impact of the request in MJ. |
Source code in ecologits/impacts/llm.py
request_usage_wcf(request_energy, if_electricity_mix_wue, datacenter_wue, datacenter_pue)
Compute the water usage impact of the request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request_energy | ValueOrRange | Energy consumption of the request in kWh. | required |
| if_electricity_mix_wue | float | WCF impact factor of electricity consumption in L / kWh. | required |
| datacenter_wue | float | Water Usage Effectiveness of the data center in L / kWh. | required |
| datacenter_pue | float | Power Usage Effectiveness of the data center. | required |
Returns: The water usage impact of the request in liters.
Source code in ecologits/impacts/llm.py
server_gpu_embodied_gwp(server_embodied_gwp, server_gpu_count, gpu_embodied_gwp, gpu_required_count)
Compute the Global Warming Potential (GWP) embodied impact of the server and the GPUs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| server_embodied_gwp | float | GWP embodied impact of the server in kgCO2eq. | required |
| server_gpu_count | float | Number of available GPUs in the server. | required |
| gpu_embodied_gwp | float | GWP embodied impact of a single GPU in kgCO2eq. | required |
| gpu_required_count | int | Number of required GPUs to load the model. | required |
Returns:
| Type | Description |
|---|---|
| float | The GWP embodied impact of the server and the GPUs in kgCO2eq. |
Source code in ecologits/impacts/llm.py
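One way to read these parameters is that the request is charged for the GPUs it occupies plus a proportional share of the server hosting them. The sketch below encodes that assumption; the function name, the allocation rule, and the sample values are all hypothetical and are not taken from the library source.

```python
def sketch_server_gpu_embodied_gwp(
    server_embodied_gwp: float,  # kgCO2eq, server without GPUs
    server_gpu_count: int,
    gpu_embodied_gwp: float,     # kgCO2eq for a single GPU
    gpu_required_count: int,
) -> float:
    """Embodied GWP (kgCO2eq) of the hardware share used by the model."""
    # Share of the server proportional to the GPUs the model occupies.
    server_share = (gpu_required_count / server_gpu_count) * server_embodied_gwp
    # Full embodied impact of each occupied GPU.
    gpus = gpu_required_count * gpu_embodied_gwp
    return server_share + gpus


# 2 of 8 GPUs in use: a quarter of the server (750) plus two GPUs (300).
print(sketch_server_gpu_embodied_gwp(3000.0, 8, 150.0, 2))  # 1050.0
```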
server_gpu_embodied_adpe(server_embodied_adpe, server_gpu_count, gpu_embodied_adpe, gpu_required_count)
Compute the Abiotic Depletion Potential for Elements (ADPe) embodied impact of the server and the GPUs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| server_embodied_adpe | float | ADPe embodied impact of the server in kgSbeq. | required |
| server_gpu_count | float | Number of available GPUs in the server. | required |
| gpu_embodied_adpe | float | ADPe embodied impact of a single GPU in kgSbeq. | required |
| gpu_required_count | int | Number of required GPUs to load the model. | required |
Returns:
| Type | Description |
|---|---|
| float | The ADPe embodied impact of the server and the GPUs in kgSbeq. |
Source code in ecologits/impacts/llm.py
server_gpu_embodied_pe(server_embodied_pe, server_gpu_count, gpu_embodied_pe, gpu_required_count)
Compute the Primary Energy (PE) embodied impact of the server and the GPUs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| server_embodied_pe | float | PE embodied impact of the server in MJ. | required |
| server_gpu_count | float | Number of available GPUs in the server. | required |
| gpu_embodied_pe | float | PE embodied impact of a single GPU in MJ. | required |
| gpu_required_count | int | Number of required GPUs to load the model. | required |
Returns:
| Type | Description |
|---|---|
| float | The PE embodied impact of the server and the GPUs in MJ. |
Source code in ecologits/impacts/llm.py
request_embodied_gwp(server_gpu_embodied_gwp, server_lifetime, generation_latency, batch_size)
Compute the Global Warming Potential (GWP) embodied impact of the request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| server_gpu_embodied_gwp | float | GWP embodied impact of the server and the GPUs in kgCO2eq. | required |
| server_lifetime | float | Lifetime duration of the server in seconds. | required |
| generation_latency | ValueOrRange | Token generation latency in seconds. | required |
| batch_size | int | Number of requests handled concurrently by the server. | required |
Returns:
| Type | Description |
|---|---|
| ValueOrRange | The GWP embodied impact of the request in kgCO2eq. |
Source code in ecologits/impacts/llm.py
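Embodied impacts are commonly amortized over the hardware lifetime. The sketch below assumes the request is charged for the fraction of the server lifetime it occupies, divided among the requests served concurrently. The function name and the toy values are hypothetical, and the amortization rule is an assumption rather than the library's confirmed formula.

```python
def sketch_request_embodied_gwp(
    server_gpu_embodied_gwp: float,  # kgCO2eq, server share plus GPUs
    server_lifetime: float,          # seconds
    generation_latency: float,       # seconds
    batch_size: int,
) -> float:
    """Embodied GWP (kgCO2eq) attributed to a single request."""
    # Fraction of the hardware lifetime consumed by this request,
    # shared between the requests processed in the same batch.
    lifetime_share = generation_latency / (server_lifetime * batch_size)
    return lifetime_share * server_gpu_embodied_gwp


# Toy values chosen for easy arithmetic: 8 s out of 4096 s, batch of 4.
print(sketch_request_embodied_gwp(2048.0, 4096.0, 8.0, 4))  # 1.0
```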
request_embodied_adpe(server_gpu_embodied_adpe, server_lifetime, generation_latency, batch_size)
Compute the Abiotic Depletion Potential for Elements (ADPe) embodied impact of the request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| server_gpu_embodied_adpe | float | ADPe embodied impact of the server and the GPUs in kgSbeq. | required |
| server_lifetime | float | Lifetime duration of the server in seconds. | required |
| generation_latency | ValueOrRange | Token generation latency in seconds. | required |
| batch_size | int | Number of requests handled concurrently by the server. | required |
Returns:
| Type | Description |
|---|---|
| ValueOrRange | The ADPe embodied impact of the request in kgSbeq. |
Source code in ecologits/impacts/llm.py
request_embodied_pe(server_gpu_embodied_pe, server_lifetime, generation_latency, batch_size)
Compute the Primary Energy (PE) embodied impact of the request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| server_gpu_embodied_pe | float | PE embodied impact of the server and the GPUs in MJ. | required |
| server_lifetime | float | Lifetime duration of the server in seconds. | required |
| generation_latency | ValueOrRange | Token generation latency in seconds. | required |
| batch_size | int | Number of requests handled concurrently by the server. | required |
Returns:
| Type | Description |
|---|---|
| ValueOrRange | The PE embodied impact of the request in MJ. |
Source code in ecologits/impacts/llm.py
compute_llm_impacts_dag(model_active_parameter_count, model_total_parameter_count, output_token_count, request_latency, if_electricity_mix_adpe, if_electricity_mix_pe, if_electricity_mix_gwp, if_electricity_mix_wue, datacenter_pue, datacenter_wue, model_quantization_bits=MODEL_QUANTIZATION_BITS, gpu_energy_alpha=GPU_ENERGY_ALPHA, gpu_energy_beta=GPU_ENERGY_BETA, gpu_energy_gamma=GPU_ENERGY_GAMMA, latency_alpha=LATENCY_ALPHA, latency_beta=LATENCY_BETA, latency_gamma=LATENCY_GAMMA, gpu_memory=GPU_MEMORY, gpu_embodied_gwp=GPU_EMBODIED_IMPACT_GWP, gpu_embodied_adpe=GPU_EMBODIED_IMPACT_ADPE, gpu_embodied_pe=GPU_EMBODIED_IMPACT_PE, server_gpu_count=SERVER_GPUS, server_power=SERVER_POWER, server_embodied_gwp=SERVER_EMBODIED_IMPACT_GWP, server_embodied_adpe=SERVER_EMBODIED_IMPACT_ADPE, server_embodied_pe=SERVER_EMBODIED_IMPACT_PE, server_lifetime=HARDWARE_LIFESPAN, batch_size=BATCH_SIZE)
Compute the impacts DAG (directed acyclic graph) of an LLM generation request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_active_parameter_count | ValueOrRange | Number of active parameters of the model (in billion). | required |
| model_total_parameter_count | ValueOrRange | Number of total parameters of the model (in billion). | required |
| output_token_count | float | Number of generated tokens. | required |
| request_latency | float | Measured request latency in seconds. | required |
| if_electricity_mix_adpe | float | ADPe impact factor of electricity consumption in kgSbeq / kWh (Antimony). | required |
| if_electricity_mix_pe | float | PE impact factor of electricity consumption in MJ / kWh. | required |
| if_electricity_mix_gwp | float | GWP impact factor of electricity consumption in kgCO2eq / kWh. | required |
| if_electricity_mix_wue | float | WCF impact factor of electricity consumption in L / kWh. | required |
| datacenter_wue | ValueOrRange | Water Usage Effectiveness of the data center in L / kWh. | required |
| datacenter_pue | ValueOrRange | Power Usage Effectiveness of the data center. | required |
| model_quantization_bits | Optional[int] | Number of bits used to represent the model weights. | MODEL_QUANTIZATION_BITS |
| gpu_energy_alpha | Optional[float] | Alpha coefficient of the "GPU energy" regression. | GPU_ENERGY_ALPHA |
| gpu_energy_beta | Optional[float] | Beta coefficient of the "GPU energy" regression. | GPU_ENERGY_BETA |
| gpu_energy_gamma | Optional[float] | Gamma coefficient of the "GPU energy" regression. | GPU_ENERGY_GAMMA |
| latency_alpha | Optional[float] | Alpha coefficient of the "Latency" regression. | LATENCY_ALPHA |
| latency_beta | Optional[float] | Beta coefficient of the "Latency" regression. | LATENCY_BETA |
| latency_gamma | Optional[float] | Gamma coefficient of the "Latency" regression. | LATENCY_GAMMA |
| gpu_memory | Optional[float] | Amount of memory available on a single GPU. | GPU_MEMORY |
| gpu_embodied_gwp | Optional[float] | GWP embodied impact of a single GPU. | GPU_EMBODIED_IMPACT_GWP |
| gpu_embodied_adpe | Optional[float] | ADPe embodied impact of a single GPU. | GPU_EMBODIED_IMPACT_ADPE |
| gpu_embodied_pe | Optional[float] | PE embodied impact of a single GPU. | GPU_EMBODIED_IMPACT_PE |
| server_gpu_count | Optional[int] | Number of available GPUs in the server. | SERVER_GPUS |
| server_power | Optional[float] | Power consumption of the server in kW. | SERVER_POWER |
| server_embodied_gwp | Optional[float] | GWP embodied impact of the server in kgCO2eq. | SERVER_EMBODIED_IMPACT_GWP |
| server_embodied_adpe | Optional[float] | ADPe embodied impact of the server in kgSbeq. | SERVER_EMBODIED_IMPACT_ADPE |
| server_embodied_pe | Optional[float] | PE embodied impact of the server in MJ. | SERVER_EMBODIED_IMPACT_PE |
| server_lifetime | Optional[float] | Lifetime duration of the server in seconds. | HARDWARE_LIFESPAN |
| batch_size | Optional[float] | The number of requests handled concurrently by the server, default set to 16. | BATCH_SIZE |
Returns: The environmental impacts DAG with all intermediate states.
Source code in ecologits/impacts/llm.py
compute_llm_impacts(model_active_parameter_count, model_total_parameter_count, output_token_count, if_electricity_mix_adpe, if_electricity_mix_pe, if_electricity_mix_gwp, if_electricity_mix_wue, datacenter_pue, datacenter_wue, request_latency=None, **kwargs)
Compute the impacts of an LLM generation request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_active_parameter_count | ValueOrRange | Number of active parameters of the model (in billion). | required |
| model_total_parameter_count | ValueOrRange | Number of total parameters of the model (in billion). | required |
| output_token_count | float | Number of generated tokens. | required |
| if_electricity_mix_adpe | float | ADPe impact factor of electricity consumption in kgSbeq / kWh (Antimony). | required |
| if_electricity_mix_pe | float | PE impact factor of electricity consumption in MJ / kWh. | required |
| if_electricity_mix_gwp | float | GWP impact factor of electricity consumption in kgCO2eq / kWh. | required |
| if_electricity_mix_wue | float | WCF impact factor of electricity consumption in L / kWh. | required |
| datacenter_wue | ValueOrRange | Water Usage Effectiveness of the data center in L / kWh. | required |
| datacenter_pue | ValueOrRange | Power Usage Effectiveness of the data center. | required |
| request_latency | Optional[float] | Measured request latency in seconds. | None |
| **kwargs | Any | Any other optional parameter. | {} |
Returns: The impacts of an LLM generation request.
Source code in ecologits/impacts/llm.py