This paper proposes a new QoS architecture that provides end-to-end QoS support. The architecture combines scalable per-flow signaling with resource reservation for aggregates of flows in both core and access networks. It builds on DiffServ: edge nodes police the incoming aggregates to ensure conformance to the aggregate reservation. Although signaling in this model is per-flow, several techniques and algorithms are developed to minimize computational complexity and thereby improve signaling scalability. More specifically, a label switching mechanism is developed to reduce the signaling message processing time at each router. Moreover, the architecture includes soft reservations whose expiration timers are implemented with a low complexity that is independent of the number of reservations. The architecture is thus able to scalably support both IntServ service models in high-speed networks.
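
As an illustration of how soft-reservation expiry can be kept cheap, the sketch below uses a generic timing wheel. It is not taken from the paper, whose timer algorithm is not described in this abstract; the class `SoftStateWheel`, its methods, and the assumption of a single fixed reservation lifetime measured in ticks are all hypothetical. Under that assumption, refreshing a reservation costs O(1), and each tick only touches the reservations that actually expire, so the cost does not grow with the total number of reservations held.

```python
class SoftStateWheel:
    """Timing wheel for soft reservations (hypothetical illustration).

    Assumes every reservation has the same lifetime, expressed in ticks.
    """

    def __init__(self, lifetime_ticks):
        self.lifetime = lifetime_ticks
        # One extra slot so a refresh always lands in the slot that was
        # emptied most recently, never in the one about to expire.
        self.size = lifetime_ticks + 1
        self.slots = [set() for _ in range(self.size)]
        self.slot_of = {}  # flow_id -> index of the slot holding it
        self.now = 0       # index of the slot expired most recently

    def refresh(self, flow_id):
        """Install or refresh a reservation in O(1)."""
        old = self.slot_of.get(flow_id)
        if old is not None:
            self.slots[old].discard(flow_id)
        new = (self.now + self.lifetime) % self.size
        self.slots[new].add(flow_id)
        self.slot_of[flow_id] = new

    def tick(self):
        """Advance one tick and return the set of expired flow ids."""
        self.now = (self.now + 1) % self.size
        expired = self.slots[self.now]
        self.slots[self.now] = set()
        for flow_id in expired:
            del self.slot_of[flow_id]
        return expired
```

With a lifetime of 3 ticks, a flow that is refreshed once and never again is returned by the third subsequent call to `tick()`, while a flow refreshed after the first tick survives until the fourth. A real router would trigger `refresh` from the per-flow signaling messages and release the corresponding share of the aggregate reservation for each expired flow.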